PREFATORY NOTE.


To each set of Lectures delivered before the Institute of 
Actuaries, when published in book form, there has generally 
been prefixed a short preface, or introduction, written by the 
President of the Institute then in office. This course, 
admirable in itself, cannot well be followed on the present 
occasion, having regard to the fact that Mr. Hardy has, in the
interval between the delivery of the Lectures and their
publication, himself been elected to the Presidential chair.
It has therefore devolved upon us, as Honorary Secretaries of
the Institute, to insert this foreword in explanation of a
seeming omission, and to express therein the confidence of the 
Council that the Lectures will be found to be of the greatest 
interest and value to the profession, which already owes so 
deep a debt of gratitude to their author. 
PREFACE.


The object of the following Lectures was to deal with the
theoretical considerations that should govern the selection
and treatment of such statistics as form the basis of the
various tables of mortality, sickness, secession, marriage,
superannuation, etc., which are of use to the Actuary. It
should be noted that in nearly all cases where mortality
tables are specially referred to what is said may be extended
to other types of statistics, though, to avoid repetition, that is
not always pointed out.


Some apology is required for the long delay in the publication
of the Lectures. It was intended, subsequently to their
delivery, to expand them into something like a complete
treatment of the subject (from the theoretical point of view),
and to add a sufficient series of examples to illustrate the
various points of theory. Unfortunately I have not found
time to carry out this intention, but as regards that part of
the subject dealing with the use of the Pearsonian Types of
Frequency Curves in Statistics this has been rendered
unnecessary by the appearance of Mr. Elderton’s admirable
book upon “Frequency Curves and Correlation”, published
by the Institute of Actuaries in 1906.


A few additions have, however, been made to the 
Lectures as originally delivered, and where these appeared 
to interfere with the continuity of the text they have been 
relegated to notes placed at the end of the Lectures.


I have very specially to thank Mr. G. J. Lidstone, F.I.A.,
for several valuable suggestions, in particular for the
contribution of Notes, and for assistance in preparing the
lectures for the Printers; and also Dr. James Buchanan,
M.A., F.I.A., F.F.A., for having kindly revised the proofs
and checked the algebra and numerical work.


G. F. H.


The Theory of the 
Construction of Tables of Mortality 


AND OF 


Similar Statistical Tables in use by the Actuary. 


BY 
G. F. HARDY, F.I.A.


FIRST LECTURE. 


When the Council asked me to deliver a series of lectures
upon some subject connected with Part III of the Institute 
Examination I selected the construction of mortality and 
similar statistical tables, mainly because it seemed to me to lie 
at the basis of our work. Actuarial science, in the modern 
sense of the term, had its origin in the collection of statistics 
(however rough and inaccurate these may have been), and their 
use for the purpose of calculating life contingencies; and 
although the Actuary has now to take account of a wider range 
of subjects than formerly, the collection and analysis of past
experience and the employment of the results of such analysis 
to forecast the future is still his most important function. 

The title of the lectures is somewhat wider and more 
ambitious than the contents may be found to warrant. To 
justify it fully would involve dealing with many questions of 
detail relating to the collection and tabulation of data, such, 
for example, as the various methods for computing the 
numbers exposed to risk in a mortality experience, &c.,
which have been many times discussed in the volumes 
of the Journal of the Institute of Actuaries and many of 
which are exhaustively dealt with by Mr. Ackland in the 
recently published “Account of Principles and Methods.” It
is evident that to deal with the subject in such detail would
outrun the limits of the six lectures which I have undertaken
to deliver. I propose, therefore, to confine myself mainly to
a consideration of the general principles involved in the
collection of statistical data, and in the construction from 
such data of tables, of which the Mortality Table is the best 
known and the most important, embodying the results in the 
form required by the Actuary, and, at the same time, to give 
such examples of the application of these principles as may 
be necessary to illustrate the subject. 

In this opening lecture in particular, I shall ask your 
indulgence if occasionally my remarks appear to be of an 
elementary character, as I think it desirable that we should 
be perfectly clear as to first principles before going on to 
more detailed consideration of the subject. 

Statistical tables, in one form or another, are familiar to 
all of us. At the basis of all such tables, and, indeed, of the
whole science of statistics, lies one of the most fundamental 
facts in nature, namely, that all phenomena of which we 
have any knowledge fall into certain classes, groups or series, 
and cluster round certain types. But for this fact we should 
be unable to classify our knowledge, indeed, should never 
have acquired any to classify. Speaking broadly, then, every 
object and every event that comes within our observation is 
one of a group or class of similar but not identical objects or 
events, which, as a class, is marked off by certain special
features from every other class, although the dividing line 
may not always be sharply drawn. These groups or classes 
are not arbitrary, but are inherent in the nature of things, 
although it is true that the particular groups which we employ 
in classifying our knowledge are chosen with a view to our 
own convenience and to the limitations of our minds. 

From a consideration of a class of objects as a whole, we 
get a conception of an average, or type,* to which each 
individual in the class more or less conforms, but from 
which, notwithstanding, every individual also diverges. Such 
divergencies or variations of individuals from the average 
type may be discontinuous, themselves running into types, or
they may be continuous. Among the individuals forming
together the type mankind, are divergencies such as those
due to sex, race, nationality, birthplace, occupation, civil
condition, &c., discontinuous variations producing sub-groups,
the boundaries of which overlap and interlace, each
of these smaller groups again being capable of endless
subdivision. These divergencies can be dealt with statistically
only by counting the members of the various sub-groups.

* The type of the class should preferably be considered as represented by the
“mode” or case of most frequent occurrence rather than by the
“mean”, but this point is not here of importance.

On the other hand, there are divergencies, which we may 
term continuous, such as those due to differences of age, 
height, weight, income, &c., &c., differing from the previous
class in that they do not involve the separation of the main 
group into sub-groups, but relate to qualities, possessed by 
each member of the group in varying degree, capable of 
measurement and numerical statement, and involving the idea 
in each instance of an average. Thus we can speak of the 
average age, height, or income of a group of persons, not of 
their average occupation or nationality, although we may 
speak of the average constitution of the group in respect of 
these latter qualities. 

A statistical table deals with some natural group of 
objects or events and is a numerical statement of the manner 
in which the members of the particular group differ inter se in 
respect of some special character or characters. If dealing 
with discontinuous variations, as for example a table showing 
the occupations of a group of persons, it will exhibit, implicitly 
or explicitly, the ratio of the magnitude of each sub-group to 
the whole, at a given moment or moments or on an average of
a given period; or it may take the form of a statement of the
extent to which variations in one respect are affected by 
variations in another, as, for example, a table showing the 
proportion of the sexes in different nationalities. If dealing 
with continuous variations, it will either represent a series of 
measurements of some quality common to members of the group, 
showing its average value for the group, and the manner in 
which individual values are grouped round such average, or it 
may represent, numerically, the manner in which deviations 
from the average in respect of some one quality A are corre- 
lated with the deviations in respect of some other quality B. 

It is mainly with the class of statistical table dealing with 
continuous variations that the Actuary has to deal ; variations 
in the ages of lives under observation, their ages, or the 
periods elapsed since entry, at death, withdrawal, marriage, 
superannuation, &c. In such tables the grouping of individual
measures round the average will, in general, but not always, 
be found to follow, approximately, certain well-defined laws. 
Taking first the tables dealing with a single variable, the
following may be considered as an example. It is a
table of the heights of 2,192 school children, and is
abridged from that given in a paper by Prof. Karl Pearson.


TABLE I.
Showing heights of 2,192 School Children, aged 12 years.

Heights in   | No. of Children | Computed                | Computed - Observed
Centimetres  | Observed        | $\kappa e^{-x^2/c^2}$   |    +    |    -
(1)          | (2)             | (3)                     |   (4)   |   (5)
139-140      |        1        |        0                |   ..    |     1
135-138      |        6        |        3                |   ..    |     3
131-134      |       31        |       25                |   ..    |     6
127-130      |      107        |      119                |   12    |    ..
123-126      |      321        |      338                |   17    |    ..
119-122      |      585        |      577                |   ..    |     8
115-118      |      618        |      596                |   ..    |    22
111-114      |      359        |      365                |    6    |    ..
107-110      |      126        |      135                |    9    |    ..
103-106      |       35        |       30                |   ..    |     5
99-102       |        3        |        4                |    1    |    ..
Total        |    2,192        |    2,192                |   45    |    45

NOTE. In the formula (col. 3) $x$ represents the deviation in centimetres
from the average; $c = 7.76$ and $\kappa$ $\left(= \frac{2192}{c\sqrt{\pi}}\right)$ has such a value as to make the area
of the graduated curve equal to the ungraduated; that is, to make the totals of
columns (2) and (3) equal.


If we consider the progression of the numbers in 
column (2), we shall see that they form a roughly symmetrical 
series, being largest in the neighbourhood of the average 
height and diminishing gradually on either side. It will be
seen that the average height is about 118.3, the number
exceeding this height being approximately equal to the
number falling short of it. In order to bring out the
approximate law of the series, I have inserted in column (3)
the computed numbers on the assumption that the frequency
of a deviation of $\pm x$ centimetres from the average is
represented by the function $\kappa e^{-x^2/c^2}$, where $c$ has the value 7.76
and $\kappa$ the value $\frac{2192}{c\sqrt{\pi}}$. The expression $\kappa e^{-x^2/c^2}$ represents
what is usually termed the curve of “facility of error”,
or the “normal” curve of frequency. It will be seen that
while the figures in column (2) are, as we should expect
them to be with such limited data, somewhat irregular, they
conform on the whole fairly closely to the normal curve.

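The computed numbers of column (3) may be approximated by a short calculation. The following is a sketch only: it assumes an average of 118.3 centimetres, takes the limits of each group at the half-centimetre (an assumption not stated in the text), and integrates the curve over each group, so that its figures need not agree exactly with the printed column.

```python
import math

TOTAL = 2192      # number of children in Table I
MEAN = 118.3      # average height in centimetres, as stated in the text
C = 7.76          # constant c of the curve kappa * exp(-x^2 / c^2)

# kappa is fixed by requiring the area under the curve to equal TOTAL.
KAPPA = TOTAL / (C * math.sqrt(math.pi))

# Group limits, lowest group first (assumed to fall at the half-centimetre).
limits = [(98.5 + 4 * i, 102.5 + 4 * i) for i in range(10)] + [(138.5, 140.5)]
observed = [3, 35, 126, 359, 618, 585, 321, 107, 31, 6, 1]


def area_below(height):
    """Area under kappa * exp(-x^2 / c^2) from minus infinity up to `height`."""
    x = height - MEAN
    return 0.5 * KAPPA * C * math.sqrt(math.pi) * (1.0 + math.erf(x / C))


# Computed number in each group = area of the curve between its limits.
computed = [area_below(hi) - area_below(lo) for lo, hi in limits]

for (lo, hi), obs, comp in zip(limits, observed, computed):
    print(f"{lo + 0.5:4.0f}-{hi - 0.5:3.0f}  observed {obs:4d}  computed {comp:7.1f}")
print(f"total of computed numbers: {sum(computed):.1f}")
```

The integration over each group, rather than the ordinate at its centre, is a choice made here for simplicity; either serves to show how nearly the two columns agree in total and in the position of the largest group.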
The “normal curve” was first used to represent the distribution
as to magnitude of errors of observation in physical
measurements. It must not be regarded as representing a law
of Nature, but rather an extremely convenient and often very
close approximation to observation; experience proving that
in many cases errors of observation and the deviations of 
individuals from the mean of a class do follow very closely 
the law referred to. The formula is therefore empirical and 
not to be established by a priori reasoning; at the same time 
we may, perhaps, see a logical basis in the following 
consideration. We may suppose that, in any individual 
measurement, the deviation from the mean of the class (as 
the difference in the height of any individual among the 
2,192 in Table I from the average height of the whole 
group) is the result of an infinity of minute causes as to 
whose nature we are in ignorance, any one of which may 
produce a minute positive or negative deviation from the 
average. These minute superimposed deviations being 
indefinitely small and indefinitely numerous, we may without 
loss of generality assume them of equal magnitude. It is 
then clear that the magnitude and sign of the total resulting 
deviation in any given case will depend upon the extent to 
which the number of these minute positive deviations exceeds
the negative, or vice versa.

If the number of possible causes of deviation is $2n$,
and if the extent of each indefinitely small deviation is $k$
($n$ being indefinitely large, but $k^2 n$ finite), then the
probability or “frequency” of a total deviation lying between
$x$ and $x+k$ will depend on our having $\left(n+\frac{x}{2k}\right)$ positive
values of $k$ and $\left(n-\frac{x}{2k}\right)$ negative values. The probability
of this occurring will be represented by the appropriate term
in the expansion of the binomial $\left(\tfrac{1}{2}+\tfrac{1}{2}\right)^{2n}$, or

$$\frac{(2n)!}{\left(n+\frac{x}{2k}\right)!\,\left(n-\frac{x}{2k}\right)!}\left(\frac{1}{2}\right)^{2n}.$$

It may easily be shown that this expression, $n$ being
indefinitely great, takes the form

$$\frac{1}{\sqrt{\pi n}}\,e^{-\frac{x^2}{4nk^2}},\quad\text{that is,}\quad(\text{Constant})\times e^{-x^2/c^2}\quad\text{with } c^2 = 4nk^2,$$

i.e., of the curve of the “facility of error.” I do not propose to
discuss at any length the properties of this particular curve,‡
but you will notice that the curve being symmetrical with 
respect to positive and negative values of x, it assumes 
that positive and negative deviations of a given magnitude 
are equally frequent, the average magnitude of such deviations
being small or large as $c$ is small or large. The maximum
ordinate corresponds to the value of $x=0$, which is the
average value of $x$; it therefore passes through the centre
of gravity of the area enclosed by the curve and the axis
of $x$, and also divides that area into two equal parts. It
assumes that indefinitely large deviations are possible, hence 
it cannot be rigidly exact, because when dealing with physical 
measurements of any kind, indefinitely large errors are not 
possible. This is not a practical objection to the use of the 
formula, however, as the probability thereunder of deviations 
of many times the average value is extremely small. 
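The passage from the binomial term to the curve of the “facility of error” may be checked numerically. The sketch below writes $j$ for $x/2k$ (a symbol introduced here for convenience, not used in the lecture) and compares the term of the binomial having $n+j$ positive and $n-j$ negative deviations with the limiting form $\frac{1}{\sqrt{\pi n}}e^{-j^2/n}$; the value $n = 1000$ is chosen arbitrarily.

```python
import math


def binomial_term(n, j):
    """Chance of n + j positive and n - j negative deviations among 2n."""
    return math.comb(2 * n, n + j) / 4 ** n


def limiting_form(n, j):
    """Limiting form (1 / sqrt(pi * n)) * exp(-j^2 / n), where j = x / (2k)."""
    return math.exp(-j * j / n) / math.sqrt(math.pi * n)


n = 1000  # half the number of minute causes; an arbitrary illustrative value
for j in (0, 10, 30, 60):
    exact = binomial_term(n, j)
    approx = limiting_form(n, j)
    print(f"j = {j:3d}   binomial {exact:.6e}   limiting form {approx:.6e}")
```

With $n$ no larger than this the two columns already agree to within a small fraction of one per cent near the centre, the agreement becoming closer as $n$ is increased.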

The following table, showing the number of entrants in
various age groups in the $O^{M}$ Experience, exhibits a quite
different distribution of the deviations from the average:


TABLE II.
Number of entrants in quinary age groups, $O^{M}$ data.

Age   | Actual Entrants | Computed No. | Computed - Actual
      | in Group*       | by Formula†  |    +    |    -
(1)   | (2)             | (3)          |   (4)   |   (5)
20    |       431       |       436    |    5    |   ..
25    |     1,273       |     1,305    |   32    |   ..
30    |     1,526       |     1,473    |   ..    |   53
35    |     1,269       |     1,265    |   ..    |    4
40    |       914       |       930    |   16    |   ..
45    |       591       |       604    |   13    |   ..
50    |       354       |       349    |   ..    |    5
55    |       182       |       178    |   ..    |    4
60    |        83       |        79    |   ..    |    4
65    |        26       |        29    |    3    |   ..
70    |         7       |         8    |    1    |   ..
Totals|     6,657       |     6,657    |   70    |   70

* Omitting hundreds.

† Formula representing number of entrants at given age $x$: $k(x-18.59)^{\cdots}\cdots$ (the exponent and the remaining factors are illegible in the source).

‡ The student may consult Woolhouse’s paper on “The Philosophy of …”,
or, for an exhaustive analysis of the properties …,
“Elements of Statistics”, Part II, Sec. II.
Here the numbers also exhibit a well-marked law
governing the deviations from the mean, but this law is no 
longer the same as that shown by the “normal” curve of 
frequency. The maximum ordinate does not coincide 
either with the average age or with the central age 
of the series; while the number of cases exceeding the 
average age no longer equals the number falling short of it. 
In other words, the curve is non-symmetrical or skew. It 
follows very approximately, however, a certain law, as will 
be seen by comparing the numbers in column (2) with 
those in column (3), which represent the computed numbers
according to the formula stated. 

Having regard to the fact that the numbers in column (2) 
represent 100’s and not units, the differences between the 
actual and computed numbers are somewhat outside the 
probable errors of observation. There are, that is to say, 
“systematic” differences between the two curves. These 
systematic differences are generally to be expected in dealing 
with age statistics. It will be seen that they are not 
incompatible with a close agreement in the general features 
of the two curves, but they serve as a warning that, in 
statistics of this nature, formulæ representing the
distribution of deviations from the mean must be regarded 
as approximations only. 
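The want of symmetry in Table II may be seen by a direct calculation from column (2). The sketch below assumes that the ages shown are the central ages of the groups and that the “central age of the series” is the middle of the range; the figure for age 25 is hard to read in the source and is taken here as 1,273, being the value consistent with the printed difference of 32.

```python
# Central ages of the quinary groups (assumed) and entrants, hundreds omitted.
central_ages = list(range(20, 75, 5))
# The entry for age 25 is doubtful in the source; 1,273 is an assumption.
entrants = [431, 1273, 1526, 1269, 914, 591, 354, 182, 83, 26, 7]

total = sum(entrants)
pairs = list(zip(central_ages, entrants))

# Average age of the entrants.
mean_age = sum(age * number for age, number in pairs) / total

# Age of maximum frequency (the position of the maximum ordinate).
modal_age = central_ages[entrants.index(max(entrants))]

# Middle of the range of ages, taken here as the "central age of the series".
middle_age = (central_ages[0] + central_ages[-1]) / 2

# Average cube of the deviations from the mean; zero for a symmetrical curve.
mean_cube = sum(number * (age - mean_age) ** 3 for age, number in pairs) / total

print(f"modal age {modal_age}, average age {mean_age:.2f}, middle age {middle_age}")
print(f"average cube deviation {mean_cube:.1f}")
```

The three ages differ, and the average cube deviation is positive, the longer tail of the curve lying on the side of the older ages.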

If we consider the curves exhibited in Tables I and II we 
see that the general character of such curves is determined 
by a few salient features : 

1. The position of the maximum ordinate; that is, the 
value of the variable having maximum frequency. 
This value is termed the mode. 

2. The average or mean value of the variable, being the 
arithmetical mean of all individual values. In a 
symmetrical curve this coincides with the “ mode.” 

3. The average deviation from the mean, corresponding 
to the closeness with which the individual measures 
are grouped round their mean value. There is a 
certain convenience, for analytical reasons, in 
adopting as our standard in this respect either the 
mean of the squares of the individual deviations, or 
the square root of this quantity. The latter is
termed the standard deviation. We may represent
the average of the squares of the deviations, or the
“mean square” deviation, by the symbol $\mu_2$, when
the standard deviation becomes $\sqrt{\mu_2}$.

4. The equality or otherwise of the positive and negative
deviations from the mean; that is, the symmetry or
skewness of the curve. The sum of the first powers
of the deviations is, of course, always zero. If the
curve is symmetrical, the sum of any odd power of the
deviations must be zero, but not otherwise. As we
have employed the square root of the average
square of the deviations as a measure of the
diffuseness or spread of the curve, termed the
“standard deviation”, so we may take the ratio of
the cube root of the average cube deviation to the
“standard deviation” as the standard of
“skewness.” If we represent the average cube
deviation by the symbol $\mu_3$, the skewness of the curve
may then be measured by $\sqrt[3]{\mu_3}\big/\sqrt{\mu_2}$.


The skewness is sometimes taken as the difference between 
the “mean” and the “mode”, divided by the standard 
deviation. 

The sums of the successive powers of the deviations 
of the variable from the mean, the area of curve being 
taken as unity, are termed the moments of the curve. 
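These quantities may be computed for the observed numbers of Table I as follows. The sketch assumes that each group may be represented by its mid-point (100.5 for the group 99-102, and so on), which is an approximation introduced here and not a step described in the lecture.

```python
import math

# Mid-points of the height groups of Table I (assumed), lowest group first.
midpoints = [100.5 + 4 * i for i in range(10)] + [139.5]
observed = [3, 35, 126, 359, 618, 585, 321, 107, 31, 6, 1]
total = sum(observed)

# Arithmetical mean of the individual values.
mean = sum(f * x for f, x in zip(observed, midpoints)) / total


def central_moment(power):
    """Average of the given power of the deviations from the mean."""
    return sum(f * (x - mean) ** power for f, x in zip(observed, midpoints)) / total


mu1 = central_moment(1)        # zero, apart from rounding error
mu2 = central_moment(2)        # the "mean square" deviation
mu3 = central_moment(3)        # the average cube deviation
std_dev = math.sqrt(mu2)       # the standard deviation

# Mode, taken as the mid-point of the group of greatest frequency.
mode = midpoints[observed.index(max(observed))]

# The two measures of skewness mentioned in the text.
skew_cube_root = math.copysign(abs(mu3) ** (1 / 3), mu3) / std_dev
skew_mean_mode = (mean - mode) / std_dev

print(f"mean {mean:.2f}, mode {mode}, standard deviation {std_dev:.2f}")
print(f"mu2 {mu2:.2f}, mu3 {mu3:.2f}")
print(f"skewness: cube-root measure {skew_cube_root:.3f}, mean-mode measure {skew_mean_mode:.3f}")
```

The mode obtained in this way can be no finer than the grouping allows, so that the second measure of skewness is here only roughly determined.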

These observed laws of the variation of measurements 
from their mean are very general, and are usually, though not 
invariably, associated with what is termed “ homogeneous ” 
data. The distinction between “homogeneous” and “hetero- 
geneous” data is of considerable importance, although not 
very easy to define. We may perhaps define a homogeneous
group as one in which the continuous variations are from a
single type only, and are unaffected by any discontinuous
variations in the group if these exist. These conditions will
hardly ever prevail, but a group may be considered for practical
purposes as homogeneous if the variations in the particular
quality dealt with are not materially affected by any discontinuous
variations existing in the group. If, however, the group can
be split up into two or three distinct series differing markedly
in certain qualities, and these differences are found, or may
reasonably be supposed, to affect the character under
examination, then the series is “heterogeneous.”


Take, for example, the class representing assured lives of
a given age, but of varying duration of assurance, and assume
we are investigating the rate of mortality of the class. If it is
found on examination that the duration of assurance materially
affects the rate of mortality, then the data treated as a whole
are heterogeneous. If it is found, however, that the duration
of assurance after reaching a certain point has no such 
influence, or an influence that is insignificant, then the data 
from this point and in this respect may be treated as 
homogeneous. The same considerations apply to distinctions 
in class of assurance, amount of policy, occupation, &c. 

The laws which appear to govern deviation from the 
average in homogeneous data are, in general, so uniform in 
action that a departure therefrom will frequently indicate 
that data which might be supposed to be homogeneous are not 
so. An interesting illustration of this may be seen in the 
case of the Male Annuitants in the New Offices’ Annuity 
Experience. Consider the following table showing the number 
of entrants for various groups of ages :— 


TABLE III.
MALE ANNUITANTS, $O^{am}$ DATA.
Number of entrants at various ages, 1863-1893.

[Column headings: (1) Ages at Entry; (2) Entrants, 1863-1893; (3) Computed Numbers; (4) Observed - Computed. The figures of the table are illegible in the source.]


These particular age groups are selected as there appears
to be a slight excess in the number of entrants at decennial
and quinquennial ages, and by placing these in the middle
of the groups we get rid of the disturbance, which would
otherwise affect the numbers.

An examination of the numbers in column (2), between
ages 53 and 78, shows that they form a nearly symmetrical
curve, as is seen by a comparison with a “normal” curve of
frequency given in column (3).* The numbers above age 78,
however, are in defect, and those below 53 are considerably in 
excess of the figures suggested by the normal curve. As 
regards the falling-off of the numbers at the older ages, it 
may be conjectured that it is in part due to the fact that many 
published tables of the cost of annuities cease at age 75 or 
80. The observed excess in the number of entrants at ages 
below 50 evidently represents the entrance at these ages of a 
class of lives differing from those forming the bulk of the 
data. It may perhaps be conjectured that a number of these 
cases are counter lives in contingent reversions, or similar 
securities, upon whose lives annuities have been purchased to 
secure the payment of annual premiums. Be that as it may, 
we find that while the deficiency of entrants at the older 
ages does not appear to affect the mortality rates, the entrants 
at the younger ages on the contrary show abnormally heavy 
mortality, the ungraduated values of the expectation of life 
for entrants under age 55 being relatively low. Hence we 
may conclude that the male annuitant experience is heterogeneous,
and in using the results as a basis of calculation
for the future, the abnormal part of the experience representing 
the entrants at the younger ages was properly rejected. 

* The constants of this curve were only roughly determined, but the
agreement with the observed numbers between ages 53 and 78 is sufficiently
close to illustrate the point under discussion.

In addition to tables of the kind we have been considering,
a statistical table may be a numerical statement of the
manner in which variation in one particular from the average
of the group is accompanied by variation in some other
particular. We may, for instance, have a table representing
a number of individuals, arranged according to height, the
numbers at each height being further arranged according to
weight. We should then have a table of double entry, each
row or column of which would represent a statistical table of
the form already considered. By means of this table we should
be able to “correlate”, as it is termed, variations in respect to
weight with variations in respect to height. Such a table
would represent a mass of figures, the bearing of which could
not easily be grasped without some further analysis. If,
however, we add to the table a column showing the average
weight for persons of a given height, we then have a ready
means of seeing how this average weight is affected by a
change in height. Having inserted the average, we have not
exhausted the information which the original figures give us.
We need also to know to what extent on the average the
weight varies when the height remains constant; that is, we
need to insert against each average weight what we have
termed the “standard deviation.”

A familiar example of such a table is one showing the 
ages of husbands and wives at marriage. Such a table would 
take the following form— 


TABLE IV.
Showing Ages of Husbands and Wives at date of Marriage.

                                  Wives’ Ages
Husbands’    | under |       |       |       |       |       | Mean Age
Ages         |  20   | 20-30 | 30-40 | 40-50 | 50-60 | 60-70 | of Wives
under 20     |   13  |    5  |   ..  |   ..  |   ..  |   ..  |   17.8
20-30        |  215  |  500  |   16  |   ..  |   ..  |   ..  |   22.3
30-40        |   14  |  107  |   39  |    4  |   ..  |   ..  |   27.0
40-50        |   ..  |   14  |   23  |   10  |    2  |   ..  |   35.0
50-60        |   ..  |    2  |    6  |    9  |    4  |   ..  |   42.1
60-70        |   ..  |   ..  |    1  |    3  |    4  |    2  |   52.0
70-80        |   ..  |   ..  |   ..  |    1  |    1  |    1  |   55.0
Mean Ages of |       |       |       |       |       |       |
Husbands     |  25.1 |  27.2 |  37.6 |  49.0 |  58.6 |  68.3 |    ..


If there were no correlation between the ages of the
husbands and wives at marriage, the figures showing the 
average ages for the various columns would (except for 
accidental fluctuations) be identical, and the same would hold 
for the average ages of the successive rows. 
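The averages in the last column of Table IV, and the standard deviations which should accompany them, may be computed row by row. The sketch below uses only the first three rows of the table and assumes that each group of wives’ ages may be represented by its mid-point, “under 20” being taken as 15; blank cells are treated as nil.

```python
import math

# Assumed mid-points of the wives' age groups ("under 20" is taken as 15).
wife_midpoints = [15, 25, 35, 45, 55, 65]

# First three rows of Table IV, keyed by the husbands' age group;
# blank cells of the table are entered as zero.
rows = {
    "under 20": [13, 5, 0, 0, 0, 0],
    "20-30": [215, 500, 16, 0, 0, 0],
    "30-40": [14, 107, 39, 4, 0, 0],
}


def mean_and_standard_deviation(counts):
    """Mean age of wives in one row, and the standard deviation about it."""
    number = sum(counts)
    mean = sum(c * a for c, a in zip(counts, wife_midpoints)) / number
    mean_square = sum(c * (a - mean) ** 2 for c, a in zip(counts, wife_midpoints)) / number
    return mean, math.sqrt(mean_square)


summary = {label: mean_and_standard_deviation(counts) for label, counts in rows.items()}

for label, (mean, deviation) in summary.items():
    print(f"husbands {label:8s}  mean age of wives {mean:5.1f}  standard deviation {deviation:4.1f}")
```

The rise of the mean age of wives from row to row is the evidence of correlation; were there none, the three means would differ only by accidental fluctuations.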

If a line were drawn through the table cutting those
points in the rows corresponding to the average ages, and
another line similarly cutting those points in the columns
representing average ages, it would be found that these
points could roughly be represented by straight lines, which
in the present example would be nearly coincident, since the
spread of the figures, as measured by their standard deviation, 
is very similar in both rows and columns. 

It is not always the case, however, that the nature of the 
correlation can be represented by a straight line. In the 
following example we have a somewhat different class of 
table showing the proportions for different age groups of
wives and widows in an Indian pension fund. 


TABLE IVa.
Showing proportion of Wives and Widows in a Pension Fund.

Ages      |  Wives  |  Widows  |  Total  | Widows, per-cent of Total
under 20  |     19  |     ..   |     19  |   …
20-30     |  1,430  |     50   |  1,480  |   …
30-40     |  3,366  |    355   |  3,721  |   …
40-50     |  3,329  |  1,018   |  4,347  |   …
50-60     |  1,653  |  1,312   |  2,965  |   …
60-70     |    476  |    933   |  1,409  |   …
70-80     |     63  |    330   |    393  |   …
80-90     |      6  |     46   |     52  |   …

(The figures of the last column are illegible in the source.)


Here it will be seen, from the run of the figures in the last
column, that they cannot be well represented by a straight
line, being somewhat in the form of the curve of $\int e^{-x^2/c^2}\,dx$,
or of a curve of the form $\dfrac{\cdots}{m+x^{\cdots}}$ (the expression is only partly
legible in the source), with values of 0 and 1 respectively at
the limits.

Such a table of correlation has an analogy with the table
of the “Exposed to Risk” and “Died”, which ordinarily
forms the basis of our Mortality Tables. This table is
virtually in the following form, column (4) representing the
number of annual survivors being usually omitted as being
implicitly contained in columns (2) and (3):

Table of Exposed to Risk and Died.

Age  | Exposed to Risk | Died | Survived
(1)  |       (2)       | (3)  |   (4)
..   |       ..        |  ..  |    ..

We have here the ages of the persons observed;* the
numbers under observation, or “ Exposed to Risk”, which, 
for the sake of simplicity, we will suppose to remain under 
observation for the entire year of age; the number of those who 
die during the year, and of those surviving. If we represent
the rate of mortality by $q_x$, then in all cases in column (3)
$q_x=1$, and in all cases in column (4) $q_x=0$, and we have a
table which is analogous to the table of the weights of
individuals of respective heights, only that instead of having
various values of $q_x$, we have in the nature of things only
two possible values 0 and 1, the average value for each group 
representing the observed “rate of mortality.” This table 
differs from that correlating weights and heights, or ages of 
husbands or wives at marriage, agreeing with that correlating 
age and civil condition, in the fact that a certain quality 
or characteristic, in this case death during a given year 
of age, is not present in varying proportions, but is 
either present or entirely absent. We are thus introduced 
to the conception of probability, the proportion of any 
group surviving or dying representing the “ probability” 
of survival or death for any individual of the group taken
at random. The idea of probability is also present in
the supposed table of weights, although not so obviously.
That table would inform us, for example, of the probability
of a person of given height exceeding or falling short of
a certain fixed standard weight, and we should then have
a table identical in form with the table of Exposed to
Risk and Died. 
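The statement that the observed rate of mortality is the average of values which can only be 0 or 1 may be put in the form of a small calculation. The numbers used below (1,000 exposed to risk, of whom 12 die) are hypothetical and chosen only for illustration.

```python
# Hypothetical figures for one year of age, for illustration only.
exposed_to_risk = 1000
died = 12
survived = exposed_to_risk - died

# Each life contributes the value 1 if it dies within the year, 0 if it survives.
values = [1] * died + [0] * survived

# The observed rate of mortality is the average of these values,
# which is the same thing as the ratio of the deaths to the exposed to risk.
rate_of_mortality = sum(values) / len(values)
probability_of_survival = 1 - rate_of_mortality

print(f"observed rate of mortality {rate_of_mortality:.3f}")
print(f"probability of survival    {probability_of_survival:.3f}")
```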

This conception of probability is important to the Actuary, 
because his object in collecting statistics is the distinctly 
practical one of measuring the probability of the happening 
of certain contingencies. It is necessary to realise clearly 
what is meant by the statement that the probability of a 
particular event has this or that value. Laplace pointed
out that when we speak of the probability of the happening
of a given event, we do so only on account of our ignorance
of the antecedents of the event, or our inability to completely
analyze them. If we entirely knew the antecedents, and if
our powers of analysis were equal to the task, we could 
predict the event. In many cases we are able to do this 
approximately, but where the effective causes at work 
are numerous and obscure, and the result in individual 
(apparently similar) cases is very variable, as in all questions
affecting life contingencies, we are unable to forecast the 
event in a given case, and must fall back upon the average 
result deduced from the examination of a large number of 
similar cases. In other words, we treat the particular case in 
question as one of an indefinitely large class of similar cases,
a sample of which we have already had under examination. 
From the results of such examination we infer the composition 
of the class as a whole, and hence the “ probability” or 
average event in an individual case. If, in the sample 
observed, a given character is present in a certain proportion 
of cases, as for instance, where out of a number of persons of 
given age under observation, a certain proportion have died 
within the year of age, then we estimate the probability of 
the event happening in a particular instance, by the ratio 
which the number of cases in which the event has occurred 
bears to the entire number of cases observed.* To determine 
the probability of a given event is therefore to assign the 
case to the natural group or series to which it properly 
belongs and to pass under examination a sample of the group 
sufficiently large to enable us to determine approximately the 
average character of the whole as regards the particular 
quality in question. We are here speaking of simple events;
the probability of a complex event, such as the survival of 
one life by another, is, of course, not determined directly by 
past observations. The latter yield the simple probabilities 
of surviving each year of age, by suitably combining which 
we arrive at the value of the probability desired. 
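
The objection raised in the footnote to this paragraph, that Laplace's formula yields different tables according to the unit of time chosen, may be illustrated by a small sketch. The figures (100 lives, 2 deaths in the year, falling in the third and ninth months) and the function names are hypothetical and are not taken from the Lectures; the simple ratio of died to exposed is unaffected by the subdivision, whereas Laplace's formula is not.

```python
def crude(events, trials):
    # Simple ratio of cases in which the event occurred to cases observed.
    return events / trials

def laplace(events, trials):
    # Laplace's formula: m events in m + n trials gives (m + 1) / (m + n + 2).
    return (events + 1) / (trials + 2)

def annual_rate(deaths_by_period, lives_at_start, estimator):
    """Chain the per-period probabilities of dying into a one-year probability."""
    alive = lives_at_start
    survival = 1.0
    for deaths in deaths_by_period:
        survival *= 1.0 - estimator(deaths, alive)
        alive -= deaths
    return 1.0 - survival

# Hypothetical experience: 100 lives, deaths in the 3rd and 9th months.
monthly = [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
yearly = [sum(monthly)]

crude_year = annual_rate(yearly, 100, crude)
crude_month = annual_rate(monthly, 100, crude)
laplace_year = annual_rate(yearly, 100, laplace)
laplace_month = annual_rate(monthly, 100, laplace)

print(f"crude ratio:   yearly {crude_year:.4f}, monthly {crude_month:.4f}")
print(f"Laplace's rule: yearly {laplace_year:.4f}, monthly {laplace_month:.4f}")
```

With the monthly unit, Laplace's formula adds a fictitious fraction of a death in every month, so that the deduced annual rate is several times that obtained with the yearly unit.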

The degree of certainty with which we can deduce the 
properties of an entire class from the part known to us, 
depends first on our assurance that the class is homogeneous, 
or at least that the portion observed is representative, such 
as would result from a selection of cases made at random, and 
secondly, on the number of cases that have been under 


* The formula deduced by Laplace, by which the true probability of an 
event which has been observed to happen $m$ times out of $m+n$ trials is taken as 
$\frac{m+1}{m+n+2}$, is obviously not applicable to such a function as the rate of mortality, 
nor to any analogous function. It is sufficient to consider that in tabulating the 
values of the probability of dying in each year of age, we are using an arbitrary 
unit of time which might just as well be a month or day, in which cases we should, 
by using the above formula, produce quite different mortality tables from the 
same data. 

observation. If we examine the figures in tables similar 
to Tables I and II, we see that, in proportion as the number 
of cases under observation is small, the figures representing 
the results of the experience are irregular, while, on the other 
hand, where the number of facts observed is very large, the 
irregularities become relatively less. We arrive at the same 
conclusion from theory. If an indefinitely large group $N$ 
contains $Np$ objects of class A and $N(1-p)$ objects not of 
class A, and if from the group $n$ objects are selected at 
random, then on the average $np$ of these will be of class A. 
If we represent the observed number in any given case as 
$np+z$, the average algebraical value of $z$ will be zero, 
while its average numerical value, irrespective of sign, will 
be very nearly $\tfrac{4}{5}\sqrt{np(1-p)}$.* This latter quantity clearly 
increases as $np$ increases, but at the same time its ratio to 
$np$ diminishes. Thus in a table of exposed to risk and died 
the actual irregularities in the number of deaths increase 
with the magnitude of the experience, but the irregularities 
in the rate of mortality diminish. Hence from theory as from 
experience we derive the conviction that if instead of the 
limited number of facts which we have been able to examine, 
we could have examined an indefinitely large number of 
similar facts, the results would have been relatively free 
from irregularity, and capable of being expressed by a 
continuous curve; without, of course, being sure that any 
such curve could be expressed algebraically. 
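
The theoretical statement above may be checked numerically. The following is a minimal sketch only: the rate p = .02 and the group sizes are hypothetical, and the comparison with the square root of np(1-p) rests on the usual binomial assumption of independent selections. It computes the exact average numerical value of z and shows that it grows with n while its ratio to np falls.

```python
from math import sqrt

def mean_abs_deviation(n, p):
    """Exact average of |X - n p| where X is binomial (n, p)."""
    prob = (1 - p) ** n          # probability of 0 objects of class A
    total = 0.0
    for k in range(n + 1):
        total += prob * abs(k - n * p)
        # Step from P(X = k) to P(X = k + 1).
        prob *= (n - k) / (k + 1) * p / (1 - p)
    return total

p = 0.02  # hypothetical proportion of class A (e.g. a rate of mortality)
results = []
for n in (100, 1000, 10000):
    mad = mean_abs_deviation(n, p)
    results.append((n, mad, mad / sqrt(n * p * (1 - p)), mad / (n * p)))

for n, mad, factor, relative in results:
    print(f"n={n:6d}  average |z|={mad:7.3f}  "
          f"factor of sqrt(np(1-p))={factor:.3f}  ratio to np={relative:.3f}")
```

The factor multiplying the square root approaches about four-fifths as n increases, in agreement with the approximation quoted in the text.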

The idea underlying the graduation of the figures of a 
statistical table, whatever be the process employed, is that a 
continuous curve may be found representing the general trend 
of the observations freed from irregularities due to paucity 
of material. This curve, we have reason to believe, will 
correspond more closely than the ungraduated curve to the 
results obtainable from a much larger body of facts. This is 
the rationale of the process of graduation and its justification. 
Such a process cannot deal with systematic errors affecting 
the table as a whole and cannot compensate for inadequate 
data. It adds weight to the results, however, at each 
individual point of the table, and assists in bringing into 
relief the true character of the curve by freeing it, in a 
large measure, from accidental irregularities. 


* The average value of $z^2$ will be $npq$, the average value of $z^3$ will be 
$npq(p-q)$, and the average value of $z^4$ will be $npq[(3n-6)pq+1]$. See Note A, p. 110. 

There may be other objects aimed at in a graduation 
besides that of removing the irregularities from the rough 
figures, with the view of bringing out more clearly the law 
underlying them. The Actuary constructs tables not merely 
to show what has happened in the past, but to enable him to 
forecast the future, and as he requires these tables as a basis 
for financial operations, considerations are introduced which 
do not arise in the treatment of purely statistical tables. 
Whatever class of events the Actuary may have to deal with, 
will be subject to change with the lapse of time. That 
portion of the class he has been able to observe lies 
necessarily in the past; the conclusions he has derived from 
their study he proposes to extend to the future. He must 
therefore consider how far the observed characters of the 
class are changing or permanent, and must endeavour 
to distinguish between changes representing permanent 
tendencies and those due merely to temporary fluctuations. 
In the selection of data suitable for his purpose the Actuary 
will aim on the one hand at a sufficiently broad basis both in 
space and time to eliminate the effects of local and temporary 
fluctuations, and on the other hand he will aim at obtaining 
as far as possible a homogeneous group of data. These two 
aims are more or less in conflict, and he will lean to the one 
side or the other, according to the object he has in view. 
Where, for example, that object is to produce a table that 
may be adopted as a general standard by various institutions, 
often differing considerably as to their individual experience, 
he must aim at a correspondingly broad foundation. In 
these circumstances it will not generally be possible to obtain 
a really homogeneous experience. If it is a question of the 
mortality of assured lives, for instance, this will be found to 
be affected by endless individual variations, age, sex, duration 
of assurance, occupation, civil condition, class of assurance, 
character of the insuring office, &c., &c., and from such 
material approximately homogeneous data could only be 
obtained by cutting up the experience into comparatively 
small groups and thus sacrificing all generality. This can be 
avoided in practice by first excluding all extreme variations. 
The sexes will be separately treated, lives so impaired as to 
prospects of longevity by personal health, family history, 
occupation, or residence in unhealthy districts as to be “rated 
up” will be excluded, as also classes of assurance that may 
be supposed subject to rates of mortality differing from the 
average. When the data have thus been trimmed of the 
extreme variations, a body of experience will generally 
remain not greatly shrunken from its original dimensions 
and in which the discontinuous variations are sufficiently 
numerous and individually unimportant to render the data 
for practical purposes homogeneous. The rates of mortality, 
or of withdrawal, can then be treated as functions of the two 
remaining variables of importance, the age and the time 
elapsed from date of entry; or as functions of the age only 
from the point at which the factor of duration may be found 
to be unimportant. 

On the other hand, the Actuary’s object may be precision 
rather than generality; he may have to deal with a group, 
subject to special conditions and presenting special 
characteristics, as is usual in the case of pension funds and 
friendly societies. Here, if the data are at all adequate, better 
results will be obtained therefrom than by having recourse to 
any general experience. Where they are insufficient by themselves as 
a basis for statistical tables they may serve as an indication as 
to what standard table is the most suitable to employ and as 
to how far and in what direction it may be desirable to 
introduce any modifications therein. In an experience of 
this character the data may sometimes be very heterogeneous, 
but there is usually the safeguard that its composition is 
approximately constant. 

A question of some importance may here be considered, 
namely, the relative claims of lives, policies, or amounts 
assured to form the basis of the mortality table. In the 
17 Offices’ data, the number of policies, in the $H^M$ and $O^M$ 
data, the number of lives passing under observation 
constitute the basis of the experience, while in the American 
Offices’ Experience (1880) the sum assured was the unit. In 
the instances of the $H^M$ and $O^M$ Tables, wherever a life would 
have been doubly observed the duplicate assurance was 
eliminated. In justification of the use of the sums assured 
as the basis of the experience, in lieu of the number of lives, 
it may be said that in this way we represent the financial 
effect of the mortality, as it makes no difference to the 
insuring company whether one claim arises for £10,000 or 
one hundred claims for £100 each. There are, however, serious 
objections to employing the sums assured as a basis for a 
mortality table, based upon a general experience. Either the 
mortality among the lives carrying large sums assured is 
similar to the average or it is not. If it is similar, the 
general character of the table will not be affected by the 
additional weight given to these lives in the experience, but 
the irregularities in the deduced rates of mortality will be 
considerably increased. The result, indeed, will be virtually 
the same as if we had used a part only of the 
available data, selected at random, instead of the 
whole. If, on the other hand, the mortality among the 
lives insured for large sums is materially different from 
the average, then the experience is not homogeneous. 
As a matter of fact, these lives of themselves do not form a 
homogeneous group. In certain societies they appear to give 
better rates of mortality than the average ; in others, where 
they are mainly represented by non-profit policies effected for 
commercial reasons, they are no doubt subject to higher rates 
of mortality than the average. As in a general experience, 
combining the individual experience of many offices, these 
lives will represent an exceptional or abnormal element, 
which may or may not persist in the future, and will certainly 
not persist equally in all societies, it is not desirable in 
deducing a general mortality table to specially “weight up” 
this part of the data. 
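
The remark that weighting by sums assured is virtually the same as using a part only of the data may be given a rough numerical form. The sketch below is an illustration under stated assumptions and is not part of the Lectures: it supposes every life to be subject, independently, to the same rate of mortality, in which case the variance of the rate weighted by sums assured equals that of an unweighted rate based on (sum of the sums assured) squared, divided by the sum of their squares, lives. The portfolio figures and the name `effective_lives` are hypothetical.

```python
def effective_lives(sums_assured):
    """Number of equally weighted lives giving the same variance in the
    deduced rate, assuming one common, independent rate of mortality."""
    total = sum(sums_assured)
    return total ** 2 / sum(s ** 2 for s in sums_assured)

# Hypothetical portfolio: 900 lives assured for 100 and 100 lives for 5,000.
portfolio = [100] * 900 + [5000] * 100
equal = [100] * 1000

print(round(effective_lives(portfolio), 1))  # far fewer than the 1,000 lives observed
print(effective_lives(equal))                # equal sums assured lose nothing
```

On these supposed figures the thousand lives carry, when weighted by amounts, no more statistical weight than about 139 lives counted singly.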

The same considerations apply, but with somewhat 
less force, to the plan of making policies rather than 
lives the basis of an experience. Without dogmatizing 
upon the point, it appears to me that the proper course is, 
where two or more policies are effected at the same time or 
at the same age at entry, to treat them as a single risk, 
but where the subsequent policies are effected at later ages, 
involving fresh medical selection, to treat them as separate 
risks. This means the elimination of duplicates in each of 
the “ select ” tables for individual ages at entry, but no 
further elimination in the resulting aggregate tables, a course 
which has the advantage of making the aggregate table the 
true aggregate of the tables for separate ages at entry. 
Judging by the results of the $O^M$ experience, this course is 
[...] “ultimate” rates of mortality after the lapse of 
a stated period from entry, which will join on [...] 
the “select” rates. 

A detail of less importance, but of considerable interest, 
is the question of the proper treatment of withdrawals in 
a mortality experience. These are usually treated as 
withdrawing upon the termination of the days of grace in 
case of lapse by non-payment of premium, and for the 
purpose of obtaining the true measure of the mortality 
experienced this course is the correct one. It should be 
borne in mind, however, that to arrive at the financial effect 
of the mortality the numbers of the exposed to risk should 
correspond to the number of annual premiums paid, and from 
this point of view the life withdrawing should not be treated 
at risk during the days of grace. The difference in the 
resulting mortality rates according to the two methods is, of 
course, very slight. 


SECOND LECTURE. 


Having dealt in the last lecture with the rationale 
of graduation in general, I now propose to refer more 
particularly to the principles underlying certain special 
methods of graduation. We may divide the various methods 
which are in use into three classes: 

1. Graphic methods. 

2. Methods based upon Interpolation or Finite 
Difference formulæ, such as Mr. Woolhouse’s. 

3. Methods which depend upon the use of Frequency 
Curves, in which we may include all methods 
based upon the assumption that the series to be 
graduated can be represented as some function of 
the variable. 

Certain general considerations apply to all these methods. 
We may have to deal either with a single series of numbers, 
such as the number, at successive ages, of lives effecting 
assurances, of persons enumerated at a census, or attacks 
from a given disease, &c.; or, as more often happens in 
actuarial statistics, the fact of importance may be the ratio 
between the corresponding members of two series of numbers 
as in a table of “Exposed to Risk” and “Died”, forming the 
basis of the Mortality Table, where the fact sought is the rate 
of mortality at each age given by the ratio of the Died to 
the Exposed to Risk, the actual numbers of these being 
of importance mainly as affording a measure of the 
trustworthiness of the deduced ratio. 

Where only a single series of numbers is involved, the 
problem is comparatively simple, and an accurate solution is 
not generally of great importance to the actuary. In the 
more usual case where the ratio of the corresponding 
members of two series of numbers is in question, the problem 
is more complicated. We have a choice of procedure: we 
may either graduate independently the two series of numbers 
(in the case supposed the numbers of the “Exposed to 
Risk” at each age and the numbers of the “Died”) or, 
disregarding the irregularities in the two series, we may 
proceed to deal at once with the ratios only. If each series 
can be satisfactorily graduated, the resulting curves being 
smooth and fitting the ungraduated series sufficiently closely— 
that is to say, within the limits of the errors of observation— 
we may then assume that the ratios of the corresponding 
terms (in the case supposed the rates of mortality) will also 
be within the limits of error. It may also be said that by 
working with the rough facts themselves, rather than the 
ratios between the two, we keep in view the weight of the 
observations at each point of the curve, and are able to see 
at once how far our graduated numbers vary from the 
original, and how far that variation is justified by the number 
of facts at each particular point. There are, however, some 
important objections to this course. In the first place, the 
ratio between the corresponding terms in the two series of 
numbers represents generally a relatively stable quantity, 
whereas the actual numbers in either series, depending as 
they do upon the extent of the experience under review at 
particular ages, are liable to fluctuations of a more or less 
arbitrary character. Further, supposing the graphic method 
of graduation or the method of finite differences is employed— 
in either case the argument is applicable, although specially 
so in the former—it will be found that each curve will 
contain certain outstanding irregularities, as it is not possible 
entirely to remove all irregularities by those methods. Hence 
in the adjusted ratios two sets of irregularities will be super- 
imposed and a less satisfactory series of values obtained than 
if the ratios themselves had been dealt with. 

A stronger objection, when dealing with a mortality 
experience, to graduating separately the numbers in the two 
series of “ Exposed to Risk” and “Died” rather than their 
ratio, is that we thereby discard our previous knowledge of 
the nature of the curve expressing that ratio—our general 
knowledge, that is, of the nature of the curve $q_x$ or $\mu_x$— 
knowledge which is of considerable assistance in graduating the 
commencement and end of the table where the data are few. 

Where a graduation of both series of numbers is made, it 
is preferable, indeed necessary if the best results are to be 
obtained, after first graduating the series corresponding to 
the “Exposed to Risk”, to re-compute the numbers of deaths, 
lapses or marriages, as the case may be, on the basis of the 
graduated numbers of the Exposures, and to operate upon 
these adjusted numbers. We are in this way less likely to 
obscure the law of the series representing the required ratios. 

Notwithstanding any theoretical objections, there may be 
occasions on which it is more convenient, or even necessary, 
to deal with the two series separately ; where, for example, 
as in the Registrar-General’s returns of the population and 
deaths for certain occupations, we have not the facts for 
individual ages, but only in certain large groups. The 
ratios of deaths to exposures for the several age groups are obviously 
not satisfactory approximations to the rate of mortality for 
the central age of the group. In these circumstances it 
appears to be best to adopt a plan similar in principle, 
though not in detail, to that employed by Milne in graduating 
the Carlisle Table, and to draw curves respectively through 
the parallelograms representing the exposures and the deaths, 
and from these deduce the numbers for individual ages. The 
graphic method, however, is not very suitable for this purpose, 
and the use of interpolation formulæ does not always give 
good results. It is generally better to make use of suitable 
frequency curves. It will be seen later that, where the 
number of groups is rather small, the use of the normal 
frequency curve, with certain modifications, enables us to 
re-distribute the numbers representing the groups of 
“ Exposed” and “ Died”, and so obtain graduated numbers 
for each age, and hence from the ratios of these a graduated 
rate of mortality. (See the Sixth Lecture, p. 91.) 


We shall now assume that we are dealing, not with the 
two independent series, but with the ratio between the two; 
as, for example, with $q_x$, or some analogous function. 


We may consider we have three independent estimates of 
the value of $q_x$:— 


1st—That derived from the observed ratio of the 
died to the exposed at age $x$. 

2nd—That derived from the data at neighbouring 
ages. 

3rd—That derived from previous experience of more 
or less similar data. 


The first and second should be suitably combined in the 
process of graduation. The last is, in the nature of things, 
a very vague estimate, and bears a relation to that derived 
directly from the observations, if these are numerous, similar 
to that of a rough measurement by inferior instrumental 
means to one made by an instrument of precision. In such 
case no weight attaches to it. 

There are circumstances, however, in which the a priori 
estimate of the values of $q_x$ becomes important, viz., when the 
observations at our disposal are extremely few. As the 
extent of our observations diminishes, the numbers of exposures 
and deaths becoming smaller, the weight to be attached to 
the deduced values of the rate of mortality becomes less, and 
a point is eventually arrived at when we obtain more 
trustworthy results by considering to what particular class 
of examined data the experience most nearly conforms in 
character, and falling back upon the results of such related 
experience. 

If we have to deal with a large experience, a somewhat 
similar difficulty arises at the commencement and end of the 
table. Generally speaking, we then derive more trustworthy 
values for the rates at these ages from a consideration of the 
general trend of the curve and our previous approximate 
knowledge of its character, than by falling back upon any 
related experience. 


Coming to the principles underlying each of these three 
methods of graduation, we consider first the graphic method, 
whether in the form employed by Milne or in the preferable 
form employed by Dr. Sprague. This method makes no 
further assumption than that the series with which we 
are dealing would, if the observations were sufficiently 
extensive, form a continuous and regular curve, and that the 
irregularities actually occurring in the ungraduated values 
are due to the smallness of the data. 

To Dr. Sprague (J.I.A., vol. xxvi, p. 77) we owe the 
most systematic and satisfactory exposition of the graphic 
method. An essential feature in his procedure is the 
preliminary division of the data (which we may suppose 
arranged by years of age) into groups, so selected as to afford 
a steady progression in the average rates of mortality for 
successive groups, due regard being had to the range of 
these groups. For examples of the method, the student must 
be referred to Dr. Sprague’s original papers. This process of 
dividing the data into selected groups appears at first sight to be 
arbitrary, but it may be justified on the grounds (1) That 
in a series of observations such as we are discussing, where 
at each age the results are affected by irregularities or errors 
of observation, a successful graduation will reduce the sum 
of these errors and also the sum of the “accumulated ” errors 
to zero, or nearly so. Hence if we compute at each age the 
accumulated errors (reckoning from either end of the series) 
these must, in order that their sum may be approximately 
zero, change sign, thus passing through zero, fairly frequently. 
The data will, therefore, be made up of consecutive groups, 
larger or smaller, in each of which there is an approximate 
balance of errors, and it may be assumed that, with a sufficient 
amount of experience and the exercise of some trouble, these 
groups can be found by inspection and trial. (2) In further 
justification of this procedure, it is to be noted that the rates 
of mortality deduced from the average rates in the selected 
groups are used as a first approximation only, the final rates 
being arrived at by repeated comparison of the graduated 
deaths with the actual numbers until a sufficiently smooth 
curve and a sufficiently close agreement have been obtained. 
At the same time I am not convinced that the use of these 
specially selected groups has any real advantage over the use 
of groups of constant range, as quinquennial or decennial, 
provided the operator recognizes that he cannot look for an 
absolute balance of errors in these latter, but must regard 
them as equally subject to errors of observation with the 
numbers at individual ages. 
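
The test of the errors and the accumulated errors referred to above may be sketched as follows. The actual and graduated deaths are hypothetical figures invented for the illustration, and the helper names are not from Dr. Sprague's papers; the sketch merely computes the deviations at each age, their running total, and the number of times the running total changes sign.

```python
from itertools import accumulate

def deviations(actual, graduated):
    """Actual minus graduated deaths at each age."""
    return [a - g for a, g in zip(actual, graduated)]

def sign_changes(values, tol=1e-9):
    """Count changes of sign, ignoring values that are zero within tol."""
    signs = [v > 0 for v in values if abs(v) > tol]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

# Hypothetical deaths at ten consecutive ages.
actual = [12, 9, 15, 14, 11, 18, 16, 21, 19, 25]
graduated = [10.5, 11.5, 12.5, 13.5, 14.7, 16.0, 17.5, 19.2, 21.1, 23.5]

dev = deviations(actual, graduated)
acc = list(accumulate(dev))          # accumulated errors from the youngest age

print("sum of errors:            ", round(sum(dev), 6))
print("sum of accumulated errors:", round(sum(acc), 6))
print("changes of sign in the accumulated errors:", sign_changes(acc))
```

Each change of sign in the accumulated errors marks the end of a group of ages within which the errors approximately balance.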

Assuming it to be practicable to draw a sufficiently 
smooth curve, free from sudden changes of curvature, and 
yet representing the observations sufficiently closely with a 
due regard to their weight in different parts of the table, 
there would appear to be nothing to object to in the principle 
of the graphic method of graduation. In practice, however, 
there are certain difficulties. The first, particularly in the 
case of a mortality table, is the question of scale. Anyone 
who has attempted to make graphic graduations will, I think, 
recognize this practical difficulty. Whether we graduate 
separately the “Exposed to Risk” and “Died”, or whether 
we graduate a function such as $q_x$, the difficulty equally 
arises. The values of $q_x$ may range in practice from about 
.005 to, say, about .5, and at the older ages increase so 
rapidly that the eye does not readily grasp the nature of the 
curve. In order that it may do so, and that the curve may 
be drawn and read off with sufficient accuracy, a certain 
proportion must be maintained between the horizontal and 
the perpendicular scale, so that the curve shall not cut the 
ordinates at too acute an angle. It is also necessary to 
represent the values of $q_x$ in two or three sections, as the 
scale suitable to the older ages will not permit of the values 
at the younger ages being represented with sufficient 
accuracy. 

Instead of operating on the rates of mortality, we may 
with advantage employ the logarithms of the rates, or the 
logarithms of the central death rates.* We thus obtain a 
curve which is much more easily dealt with. From the fact 
that the rates of mortality change slowly at the younger 
ages, and at the older ages generally approximate to a 
geometrical progression, the logarithms of the rates are 
nearly in the form of an arithmetical progression, and are 
represented by a line having very little curvature. At the 
oldest ages, indeed, it may very conveniently be taken as a 
straight line. 
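
The advantage of working with logarithms may be seen from a short sketch. It assumes, purely for illustration, that the rates at the older ages follow an exact geometrical progression with hypothetical constants B and c; actual rates only approximate to this. Under that assumption the second differences of the rates are considerable, while those of their logarithms vanish.

```python
from math import log10

def differences(values):
    """First differences of a series."""
    return [b - a for a, b in zip(values, values[1:])]

# Hypothetical geometrical progression q_x = B * c**x at quinquennial ages.
B, c = 0.00005, 1.10
ages = list(range(60, 91, 5))
rates = [B * c ** x for x in ages]
logs = [log10(q) for q in rates]

second_of_rates = differences(differences(rates))
second_of_logs = differences(differences(logs))

for x, q, lg in zip(ages, rates, logs):
    print(f"age {x}: rate {q:.4f}  log rate {lg:.4f}")
print("second differences of rates:", [round(d, 4) for d in second_of_rates])
print("second differences of logs: ", [round(d, 10) for d in second_of_logs])
```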

Perhaps the main difficulty in graphic graduation is that 
it is by no means easy, even with mechanical aids, to draw a 
sufficiently smooth curve. The curve as drawn may appear 
to be smooth, but on reading it off and examining the series 
of values obtained, we find irregularities which, in order to 
produce a satisfactory graduation, must be removed by a 
further adjustment. If we are dealing with a relatively 
small experience—in which cases these practical difficulties 
are correspondingly increased—they may be overcome to a 
large extent by using as a base line a well-graduated standard 
table representing an experience of similar character. By 
computing the “ expected ” deaths according to the standard 
table, and dealing with the ratio of the actual to the 
“expected” deaths in successive age groups, we avoid the 
difficulties due to inequality of scale and to the rapid increase 
in the value of the ordinates at the extreme ages. The curve 


* See, however, Note B, p. 114, as to precautions in dealing with logs of rates 
of mortality and similar functions. 

of ratios, apart from accidental fluctuations, will often be 
found to approximate to a straight line, the departures from 
which can be, of course, represented on a relatively large 
scale. In particular, the difficulty arising from the paucity 
of observations at either end of the table will be avoided by 
making each extremity of the curve of ratios terminate in 
a straight line, the locus of which will depend upon the 
general trend of the curve in the neighbourhood. The 
resulting values at the extremes of the table obtained in 
this way will be more trustworthy than those obtained 
without the aid of the standard base line.* 
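
The use of a standard table as a base line may be sketched thus. All the figures (exposed to risk, deaths, and standard rates) are hypothetical, as are the function names; the sketch computes the “expected” deaths by the standard table and the ratio of actual to expected deaths in successive age groups of equal range.

```python
def expected_deaths(exposed, standard_rates):
    """Deaths expected at each age according to the standard table."""
    return [e * q for e, q in zip(exposed, standard_rates)]

def grouped_ratios(actual, expected, width):
    """Ratio of actual to expected deaths in successive groups of `width` ages."""
    ratios = []
    for start in range(0, len(actual), width):
        a = sum(actual[start:start + width])
        e = sum(expected[start:start + width])
        ratios.append(a / e)
    return ratios

# Hypothetical experience at ten consecutive ages.
exposed = [500, 480, 460, 440, 420, 400, 380, 360, 340, 320]
deaths = [4, 6, 3, 7, 5, 6, 8, 5, 9, 7]
standard = [0.010, 0.011, 0.012, 0.013, 0.014,
            0.015, 0.016, 0.017, 0.018, 0.019]

expected = expected_deaths(exposed, standard)
ratios = grouped_ratios(deaths, expected, width=5)
print([round(r, 3) for r in ratios])
```

The graduated rates would then be obtained by multiplying the standard rates by the ratios read from a smooth curve drawn through these points.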


In finite difference or interpolation methods of graduation 
(of which we may take Woolhouse’s as the best known type) 
the underlying assumption is virtually the same as in the 
graphic method, viz., that the curve is of such a nature that 
the ordinary methods of interpolation can be applied. Put 
more precisely, Woolhouse’s method assumes that for a range 
of 15 consecutive ages the values of $l_x$ can be represented 
with sufficient accuracy by a curve of the third order, i.e., 
$l_{x+t} = l_x + at + bt^2 + ct^3$ when $t$ is not numerically $> 7$. As this 
assumes the fourth and higher differences of $l_x$ to be zero, we 
may write 


$$l'_x = \frac{1}{125}\Big\{-3(l_{x-7}+l_{x+7}) - 2(l_{x-6}+l_{x+6}) + 3(l_{x-4}+l_{x+4}) + 7(l_{x-3}+l_{x+3}) + 21(l_{x-2}+l_{x+2}) + 24(l_{x-1}+l_{x+1}) + 25\,l_x\Big\}$$


where $l'_x$ may be taken as the graduated value of that 
function, the quantities on the right-hand side of the equation 
being the ungraduated values. 
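
As a check upon the formula, the following sketch applies the fifteen coefficients (25, 24, 21, 7, 3, 0, -2, -3, each divided by 125) to a series which is exactly of the third order, when the central value should be reproduced without change. The cubic series and the function name are hypothetical and chosen only for the illustration.

```python
# Numerators of Woolhouse's coefficients, by distance from the central term;
# each is to be divided by 125.
WOOLHOUSE = {0: 25, 1: 24, 2: 21, 3: 7, 4: 3, 5: 0, 6: -2, 7: -3}

def woolhouse(values, i):
    """Graduated value at position i; needs 7 values on either side of i."""
    total = WOOLHOUSE[0] * values[i]
    for k in range(1, 8):
        total += WOOLHOUSE[k] * (values[i - k] + values[i + k])
    return total / 125

# A hypothetical series lying exactly on a curve of the third order.
cubic = [100000 - 300 * t + 5 * t ** 2 - t ** 3 for t in range(31)]
smoothed = woolhouse(cubic, 15)
print(cubic[15], smoothed)       # the cubic is reproduced unchanged

# An isolated error of 125 at the central age is reduced to 25 at that age.
rough = list(cubic)
rough[15] += 125
print(woolhouse(rough, 15) - cubic[15])
```

The isolated error is not destroyed but spread over the neighbouring ages in proportion to the same coefficients.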

This formula, which is that used by Woolhouse in the 
graduation of the $H^M$ Table, is of course only one of 
numerous possible formulæ deducible from the above expression 
for $l_{x+t}$. Others may be found resulting in a smoother 
graduated series, but all the formulæ since proposed as 
improvements on his are based upon the same general 
principle. An indefinite number of such formulæ can be 
found, even when the range is fixed.† In particular may be 

* See Lidstone, J.I.A., xxx, p. 212. These remarks are equally applicable 
to graduation by a finite difference formula (see J.I.A., vol. xli, p. 89). 

† See Todhunter, J.I.A., xxxii, 378; G. F. Hardy, J.I.A., xxxii, 371. 

mentioned Mr. J. A. Higham’s, Dr. Karup’s, and that used 
by Mr. J. Spencer in the graduation of the “Manchester 
Unity” mortality experience. See the following table 
showing the value of $u'_0$ in terms of the ungraduated $u$’s:— 


TABLE V. 


Showing the values of $d_x$, where $u'_0 = \sum u_x d_x$, by various 
well-known Graduation Formulæ. 


Distance from 
Central Term      Spencer     Karup      Higham     Woolhouse 

      0            .172       .200       .200       .200 
     ±1            .163       .182       .192       .192 
     ±2            .135       .139       .144       .168 
     ±3            .095       .085       .080       .056 
     ±4            .052       .034       .024       .024 
     ±5            .017       .000       .000       .000 
     ±6           -.005      -.013      -.016      -.016 
     ±7           -.015      -.014      -.016      -.024 
     ±8           -.015      -.010      -.008       .000 
     ±9           -.009      -.003       .000 
    ±10           -.003       .000 
    ±11            .000 



It is clear that no such formula will entirely remove 
the irregularities in the series, and in Woolhouse’s graduation 
of the $H^M$ Table the outstanding irregularities were removed 
by an empirical process similar to that employed for the 
graduation of the 17 Offices’ Table, and described in his 
paper (J.I.A., vol. xii, pp. 140-1). The object aimed at in 
a formula such as these, should be so to select the coefficients 
of the terms on the right hand that, while giving an 
expression for the value of the central function correct as far 
as the order of differences employed, the formula will 
produce the maximum smoothness in the flow of the 
graduated values. This may be done by simple experiment, 
or we may adopt some empirical measure or standard of 
smoothness and thereby compute the most advantageous 
coefficients. We may, for example, adopt as our standard 
of smoothness the extent to which the second differences 
of our graduated function are affected by the errors of 
observation in the original table. 

Applying this standard to Woolhouse’s formula, we have 
for the graduated second central difference of $l_x$ (using 
central differences for the sake of symmetry) 


$$\begin{aligned}
125\,\Delta^2 l'_{x-1} ={}& -3l_{x-8} + 4l_{x-7} + l_{x-6} + l_{x-5} + l_{x-4} + 10l_{x-3} \\
& - 11l_{x-2} - 2l_{x-1} - 2l_x - 2l_{x+1} - 11l_{x+2} \\
& + 10l_{x+3} + l_{x+4} + l_{x+5} + l_{x+6} + 4l_{x+7} - 3l_{x+8}.
\end{aligned}$$


If we assume that on the average each of the ungraduated 
values of $l_x$ on the right-hand side of this equation is subject 
to a mean error of $\pm e$, and if we assume that these errors 
may be combined according to the normal law, then the mean 
error of the entire expression for $\Delta^2 l'_{x-1}$ will be found by 
multiplying $e$ by the square root of the sum of the squares of 
the coefficients, giving: 


$$\frac{\sqrt{2(3^2+4^2+1^2+1^2+1^2+10^2+11^2+2^2)+2^2}}{125}\,e = \frac{\sqrt{510}}{125}\,e = .18\,e.$$

In the same way it may be shown that in Karup’s formula 
the mean error in $\Delta^2 u'_{x-1}$ is about $.068e$, where $e$ is the 
mean error of a single value of $u_x$. It must not be supposed 
from these results that the mean errors in the graduated 
values of $l_x$ or $u_x$ are proportionately reduced. The mean 
errors in the graduated functions when Woolhouse’s formula 
is employed are reduced to about .42 of the mean errors in 
the ungraduated functions, or are about equivalent to the 
mean errors of the ungraduated values corresponding to an 
experience 5½ times larger. The graduated table based on 
the smaller data would, however, be smoother than the 
ungraduated table based upon the larger data. (See J.I.A., 
xxxii, pp. 376-7.) 
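
The figures just quoted for Woolhouse's formula may be verified by a short computation. The sketch below assumes, as in the text, that the ungraduated values are subject to independent errors of equal mean amount e, so that the mean error of any linear compound is e multiplied by the square root of the sum of the squares of its coefficients; the function names are chosen for the illustration only.

```python
from math import sqrt

# Numerators (over 125) of Woolhouse's coefficients for l(x-7) ... l(x+7).
WOOLHOUSE = [-3, -2, 0, 3, 7, 21, 24, 25, 24, 21, 7, 3, 0, -2, -3]

def second_difference(coeffs):
    """Coefficients of the second central difference of the graduated value."""
    padded = [0, 0] + list(coeffs) + [0, 0]
    return [padded[i] - 2 * padded[i + 1] + padded[i + 2]
            for i in range(len(coeffs) + 2)]

def error_factor(coeffs, divisor):
    """Multiplier of e: root of the sum of squared coefficients over divisor."""
    return sqrt(sum(c * c for c in coeffs)) / divisor

second = second_difference(WOOLHOUSE)
smooth = error_factor(second, 125)      # mean error of the second difference
value = error_factor(WOOLHOUSE, 125)    # mean error of the graduated value
equivalent = 1 / value ** 2             # size of the equivalent experience

print(second)
print(round(smooth, 3), round(value, 3), round(equivalent, 2))
```

The sum of the squares of the second-difference coefficients is 510, and the three printed factors agree with the values .18, .42 and about 5½ given above.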

Taking a generalized formula, such as 

$$
u'_x=a\,u_x+b(u_{x+1}+u_{x-1})+c(u_{x+2}+u_{x-2})+\text{\&c.}\ldots+k(u_{x+t}+u_{x-t}),
$$

where $u'_x$ represents the graduated value of $u_x$, and 
assuming that each of the ungraduated values $u_x$, &c., 
is affected by the same mean error $\pm e$, it is of course 
possible to determine the values of $a$, $b$, $c$, &c., so that the 
mean error in, say, $\Delta^2 u'_{x-1}$ shall be a minimum. Noting that 
$a=1-2b-2c-\text{\&c.}$, and that $b+4c+9d+\text{\&c.}=0$, in order that 
the formula may be correct to 3rd differences, an expression 
may be found for $\Delta^2 u'_{x-1}$ in terms of $u_{x-t-1}$, $u_{x-t}$, &c., with 
coefficients involving $c, d, \ldots k$. If the coefficients of each 
term are now equated to zero, there will be $(2t+3)$ equations 
of condition with $(t-1)$ unknowns, which may be solved 
by the usual method of least squares. 
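
The process just described may be sketched as follows. This is an illustration under stated assumptions, not a working graduation formula: the free coefficients $c, d, \ldots k$ are taken as unknowns, $a$ and $b$ are eliminated by the two conditions given above, and the normal equations are solved in exact fractions. All function names are hypothetical.

```python
from fractions import Fraction

def full_weights(free, t):
    """Symmetric weights for u[x-t] ... u[x+t], given free = (c, d, ..., k)."""
    half = [Fraction(0), Fraction(0)] + [Fraction(v) for v in free]
    half[1] = -sum(j * j * half[j] for j in range(2, t + 1))  # b + 4c + 9d + ... = 0
    half[0] = 1 - 2 * sum(half[1:])                           # a = 1 - 2b - 2c - ...
    return half[:0:-1] + half

def second_difference(weights):
    padded = [0, 0] + list(weights) + [0, 0]
    return [padded[i] - 2 * padded[i + 1] + padded[i + 2]
            for i in range(len(weights) + 2)]

def solve(matrix, rhs):
    """Gauss-Jordan elimination on exact fractions."""
    n = len(rhs)
    rows = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [p - factor * q for p, q in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def smoothest_weights(t):
    """Weights minimising the sum of squared second-difference coefficients."""
    m = t - 1                                     # number of unknowns
    base = second_difference(full_weights([0] * m, t))
    cols = []
    for j in range(m):
        unit = [0] * m
        unit[j] = 1
        moved = second_difference(full_weights(unit, t))
        cols.append([s - b for s, b in zip(moved, base)])
    normal = [[sum(p * q for p, q in zip(cols[i], cols[j])) for j in range(m)]
              for i in range(m)]
    rhs = [-sum(p * b for p, b in zip(cols[i], base)) for i in range(m)]
    return full_weights(solve(normal, rhs), t)

weights = smoothest_weights(7)                    # same range as Woolhouse's formula
roughness = sum(c * c for c in second_difference(weights))
print([str(w) for w in weights], float(roughness))
```

As the text goes on to observe, the weights so found are in general awkward fractions; the sketch serves only to show that the minimum is well defined.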

This is somewhat theoretical, however, as the values we 
should obtain for the coefficients would be generally 
fractional, and the resulting graduation formula would not 
lend itself to any continuous method of computation, as is 
the case with Woolhouse’s and other similar formulæ. An 
alternative would be to fix upon a convenient set of 
summations, and then to determine the function summed 
(called by Mr. Lidstone the “operand”) so that (1) first and 
second differences may vanish (see J.I.A., xxxii, 371, &c.); 
(2) the range of the formula may be what we require; 
and (3) subject to (1) and (2) the coefficients shall be 
such as to make the mean error in $\Delta^2$ or $\Delta^3$ a minimum. 
This might give a fairly convenient working formula, as 
when once the operand was formed the ordinary convenient 
method of summation would apply. 

If we consider the effect of such a formula of graduation 
upon the outstanding or unbalanced errors of observation in 
a small group of ages, we shall see that they are not very 
materially diminished. If, for example, we express the sum 
of five consecutive graduated values in terms of the 
ungraduated values, we shall have, in the case of Woolhouse’s 
formula, 

$$
l'_{x-2}+l'_{x-1}+l'_x+l'_{x+1}+l'_{x+2}=\tfrac{1}{125}\left(80l_{x-2}+101l_{x-1}+115l_x+101l_{x+1}+80l_{x+2}\right)
$$
+ terms involving other values of $l$. 


Here it is obvious that any systematic or unbalanced error in 
the original group will not be greatly reduced (probably 
to about three-fourths of its amount) in the graduated table. 
While, therefore, finite difference formulæ of graduation 
yield, generally, a smooth curve as regards the progression of 
the graduated values from age to age, they have a tendency 
to reproduce any waviness in the original due to the 
unbalanced errors affecting small groups of four or five 
consecutive ages. 
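
The share of a group error that survives graduation may be verified numerically. The sketch below, which assumes Woolhouse's weights in their usual form and uses illustrative names throughout, forms the coefficient of each ungraduated value in the sum of five consecutive graduated values.

```python
# Woolhouse's weights (numerators over 125).
WOOLHOUSE = [-3, -2, 0, 3, 7, 21, 24, 25, 24, 21, 7, 3, 0, -2, -3]

def window_sum_coefficients(weights, width=5):
    """Coefficient of each ungraduated value in the sum of `width`
    consecutive graduated values (numerators over 125)."""
    out = [0] * (len(weights) + width - 1)
    for shift in range(width):
        for i, w in enumerate(weights):
            out[shift + i] += w
    return out

coeffs = window_sum_coefficients(WOOLHOUSE)
centre = len(coeffs) // 2
central_five = coeffs[centre - 2:centre + 3]

# If the five central ungraduated values all carry the same error, this is
# the fraction of it that reappears in the five graduated values.
share = sum(central_five) / (5 * 125)

print(central_five, round(share, 3))
```

The share comes out at rather more than three-fourths, in agreement with the estimate in the text.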

A question arises in connection with this method as to 
what particular function should be selected for graduation. 
In the case of Woolhouse’s original formula the function 
operated upon was $l_x$. Practically speaking, except in the 
latter portion of the table, this approximates in result to a 
graduation of the rates of mortality. This may be seen from 
the following relations. Any adjustment of the $l_x$ column by 
a finite difference formula has, of course, the same effect as 
a similar graduation of the $d_x$ column. Since $d_x=l_xq_x$, and 
since for the range of ages included in the formula (fifteen 
in Woolhouse’s formula, of which, however, only the five 
central ages are heavily weighted) the values of $l_x$ are not 
in general widely different, the graduation of the $d_x$ 
column should give results not materially different from those 
obtained by graduating $q_x$. At the older ages, however, 
there may be significant differences in the results, and I must 
express my preference for the rate of mortality as the more 
suitable function to graduate if the observations are duly 
weighted or if proper precautions are taken to avoid 
anomalous results at either end of the table where data are 
scanty. 

An objection to the principle of the finite difference 
methods of graduation is that the weight of the observations 
is not allowed for at various ages. This objection is not very 
serious, however, as at the commencement and end of the 
table, where it would be chiefly felt, the method is usually not 
strictly applied. It may be noted that if the $l_x$ function be 
graduated, then its rapid decrease in value at the oldest ages 
in the table gives automatically a diminishing weight to the 
observations with increasing age, but at the same time yields 
somewhat irregular graduated values. The objection may, of 
course, be got rid of by first applying a smooth series of 
weights to the function to be graduated, prior to graduation, 
and eliminating these factors afterwards. 

A difficulty arises in the use of finite difference formulæ 
from the smallness of the data at the extremes of the table 
and from the fact that the first 7 or 8 values of the 
graduated function cannot be obtained from the formula. In 
the case of a mortality table there is not so much difficulty 
in dealing with extreme old age, because there, as 
Woolhouse points out, if we are dealing with the function $l_x$ it 
may be taken $=0$ beyond the limiting age of the table, or if 
we are graduating the rate of mortality, $q_x$ may be put down 
as equal to unity. As regards the earlier ages, Woolhouse’s 
method is to obtain from the formula the graduated values 
of $l_x$ so far as this can be done, that is, to within 7 years of 
the initial age, and to compute the values for the first seven 
ages of the table from the values of h, L, le and 1, 
(, representing the value of J, for the initial age) on the 
assumption of a constant third difference. This method may 
in certain cases lead to anomalous results, even negative 
rates of mortality. Mr. Ackland has given an alternative 
method of considerable ingenuity (J.I.A., vol. xxiii, p. 357). The 
difficulty may be avoided by assuming values for the initial 
ages, as, for example, a constant average value of $q_x$ or $d_x$, or 
other arbitrary values deducible from the general character 
of the experience. A more satisfactory method would be to 
determine $q_x$, for the first 10 or 15 ages, by the method of 
moments or least squares, on the assumption that it could 
be represented by a first or second difference function. All 
these methods, however, are expedients more or less 
empirical, though they may in practice lead to sufficiently 
satisfactory results. 

The Finite Difference methods of graduation all 
assume that the functions to be graduated may be repre- 
sented for successive small tracts of ages by a parabolic curve 
of the form— 


$$
u_x=a+bx+cx^2+\text{\&c.}
$$


We are not bound to assume this particular form of 
function. We can employ the principle of the Interpolation 
method, representing our function by some other form, 
as, for example, $m_x=a+bc^x$ corresponding to Makeham’s 
formula. 

The principle of the methods of graduation we have been 
discussing, of which Woolhouse’s is a type, must not be 
confounded with that used by Davies in graduating the 
Equitable experience, nor with that used by Mr. Berridge 
in graduating the Peerage mortality. These latter are more 
nearly allied to graduation by frequency curves than to 
Woolhouse’s method. In Davies’ Equitable graduation, 
curves of the third order are actually fitted to successive 
sections of the $l_x$ column, the values of $l_x$ from 10 to 40 being 
virtually found by a third difference interpolation from the 
values Jy, lo, I, 40, those from ly to ly similarly from the 
valued of Ligh Uy tgp Gop Bue sO ons LIE Berridge’s graduation 
of the Peerage mortality followed a similar principle, except 
that he represented the entire series of values of $\log l_x$ from 
15 to 75 by means of a single curve of the sixth order, based 
upon the values of that function for decennial intervals of age. 

As to the relative merits of graphic and finite difference 
methods of graduation, the former has an undoubted advantage 
when the number of facts at our disposal is small. In these 
cases formulæ of the type of Woolhouse’s cannot be expected 
to produce very satisfactory results, as in the comparatively 
small section of the curve embraced by the formula the true 
character of the curve will frequently be obscured by the 
errors of observation. These formulæ are at their best when 
applied to a table based upon fairly extensive data, and 
presenting a curve without any rapid change of character. 
The advantages possessed by the graphic method in dealing 
with a small experience, owing to its flexibility and its power 
of bringing under contribution large sections of the curve at 
once, are, however, still more noticeable when frequency 
curves can be suitably employed. 



We have already spoken of the success or sufficiency of a 
graduation, but we have not said anything as to what is the 
proper test of a successful graduation. Before dealing with 
the general principle of graduation by means of frequency 
curves, it will be useful to consider this question. There 
are obviously two conditions that should be fulfilled by a 
graduation. In the first place, we require a smooth and continuous 
progression in the graduated values. This is required because 
we have good reason for believing that if the true values were 
ascertainable, they would exhibit this property. In the 
second place we require an adherence to the original data 
sufficiently close to be fairly within what we may conveniently 
term the errors of observation. 

The standard of smoothness is not easy to define. If a 
formula is employed representing the ultimate values of 
$l_x$, $q_x$, or $\mu_x$ as a function of the age, this in itself secures 
a smooth series. In other cases the sufficiency or otherwise 
of the graduation in this respect must be left to individual 
judgment. The advantages of a really smooth curve are 
mainly found where it is necessary to resort to interpolation 
or to the use of summation formulæ; and, further, in the 
practical consideration that with a really smooth curve nearly 
all tables calculated therefrom can be sufficiently checked by 
differencing. 

As regards the second requirement, that of adherence to 
the general features of the ungraduated experience, it is 
easier to set up a criterion. We have already seen that if 
the true value of the probability of an event happening at a 
single trial is $p$, the event will, on the average, happen $np$ 
times in $n$ trials, and if there are series of $n_1$, $n_2$, $n_3$, &c., 
trials in which the probabilities of the respective events are 
$p_1$, $p_2$, $p_3$, &c., then on the average the total number of 
occurrences in such a series of trials will be 
$n_1p_1+n_2p_2+n_3p_3+$, &c. That is to say, if the observed occurrences 
are $\theta_1$, $\theta_2$, $\theta_3$, &c., then the average value of each term 
$(\theta_1-n_1p_1)$, $(\theta_2-n_2p_2)$, &c., and consequently of the sum of 
such terms, will be zero.* It is also obvious that the average 
value of the sum of the series 
$(\theta_1-n_1p_1)+2(\theta_2-n_2p_2)+3(\theta_3-n_3p_3)+$, &c., 
and generally of the series whose $r$th 
term is: 

ir 
oe (8,,— pr) 


will be zero. In the case of a mortality experience these 
quantities $(\theta_1-n_1p_1)$, &c., represent the deviations of the 
observed deaths at each age from the “Expected Deaths”, 
as computed by the true rates of mortality, supposing these 
to be known. It follows, therefore, that we should expect 
the total of such deviations on the average to be zero, and 
in the same way the average value of the successive sums 
of the accumulated deviations should be zero. Generally, 
if we put 
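$$
\begin{aligned}
\Sigma n &= n_1+n_2+n_3+n_4+\text{\&c.} \\
\Sigma\,\Sigma n=\Sigma^2 n &= n_1+2n_2+3n_3+\text{\&c.} \\
\Sigma\,\Sigma^2 n=\Sigma^3 n &= n_1+3n_2+6n_3+\text{\&c.;}
\end{aligned}
$$
we shall have on the average 
$$
\Sigma^t(\theta_r-n_rp_r)=0.
$$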


* This is not the most probable value of these terms, although in general 
it will be very close thereto. The Actuary, however, requires to consider 
the average result, not the most probable. 

We should not expect (assuming the true values of $p_r$ to 
be known) that these sums of the deviations of the actual 
from the expected numbers would actually be equal to zero 
in any given case, but we should expect in a long series of 
cases that the positive values would approximately balance 
the negative. We do not expect to obtain exactly 1,000 
heads in a series of 2,000 tossings of a coin, but we should 
expect to find that the average number of heads over a great 
number of such series of tossings would be very close to that 
figure. This reasoning leads us to the conclusion that, 
given a successful graduation, we should not only have 
obtained a smooth series, but that the sum of the deviations 
between the computed events (deaths or otherwise) and 
the observed numbers, would be nearly zero, and that 
the successive sums of the accumulated deviations would 
be small. 

It is not necessary in practice that this test should be 
pushed too far. We may be satisfied if the sum of the 
deviations and the sum of the accumulated deviations are 
practically zero; if the total deviations in successive sections 
of the table (e.g., in quinquennial or decennial groups) appear 
to be, on the whole, within the limits of the errors of 
observation; and if the total of the accumulated deviations 
changes sign fairly frequently. On the other hand we should 
expect that the total deviations irrespective of sign should 
not be materially less than their theoretical amount. 
Otherwise we should conclude that the series was 
under-adjusted and that accidental fluctuations in the curve had 
been incorporated as inherent characteristics. 
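
The quantities entering into these tests are easily tabulated. The following sketch uses invented figures, chosen only for illustration, and accumulates the deviations from the youngest age upwards (an assumption; accumulating from the other end serves equally well when the total deviation vanishes). The function name is hypothetical.

```python
from itertools import accumulate

def deviation_summary(actual, expected):
    """Totals used in judging a graduation: deviations, accumulated
    deviations, deviations irrespective of sign, and changes of sign."""
    dev = [a - e for a, e in zip(actual, expected)]
    acc = list(accumulate(dev))
    sign_changes = sum(1 for p, q in zip(acc, acc[1:]) if p * q < 0)
    return {
        "sum_dev": sum(dev),
        "sum_acc": sum(acc),
        "abs_dev": sum(abs(d) for d in dev),
        "sign_changes": sign_changes,
    }

# Invented observed and expected deaths at six consecutive ages.
actual = [12, 9, 15, 11, 18, 14]
expected = [10.5, 11.0, 13.5, 12.5, 16.0, 15.5]

summary = deviation_summary(actual, expected)
print(summary)
```

In practice the total irrespective of sign would be compared with its theoretical amount, which the sketch does not attempt to compute.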

These tests of a graduation are well known to Actuaries, 
and, indeed, have been very generally employed by them. 
So far as they go, they correspond to the method of moments 
which Prof. Karl Pearson has elaborated and employed with 
such success in the fitting of frequency curves to statistical 
data. It is clear, however, that they can only be employed 
systematically in conjunction with those or other curves 
capable of analytical expression. Using methods of graduation 
based upon Finite Difference formulæ, such as 
Woolhouse’s, we cannot secure that the successive sums of 
the deviations shall vanish, though in general we may expect 
them to be small. Using the graphic method, we can, by a 
gradual process of hand-polishing the curve, reduce the 
accumulated deviations and their sum to as small a value as 
we please,* but the process is a tedious one. 
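
The linear correction described in the footnote to this paragraph, in which the modified rate is taken as $m'_x=a+(1+b)m_x$, may be sketched as follows. The figures are invented for illustration, exact fractions are used so that the two sums vanish identically, and the function names are hypothetical.

```python
from fractions import Fraction as F

def first_sum(values):
    return sum(values)

def second_sum(values):
    """Sum with weights 1, 2, 3, ... as in the definition of the second summation."""
    return sum((i + 1) * v for i, v in enumerate(values))

def linear_correction(exposed, deaths, rates):
    """Return (a, b) making both sums of deviations vanish for m' = a + (1 + b) m."""
    em = [e * m for e, m in zip(exposed, rates)]
    dev = [x - d for x, d in zip(em, deaths)]
    A, B = first_sum(dev), second_sum(dev)
    det = first_sum(exposed) * second_sum(em) - first_sum(em) * second_sum(exposed)
    a = (-A * second_sum(em) + B * first_sum(em)) / det
    b = (-B * first_sum(exposed) + A * second_sum(exposed)) / det
    return a, b

# Invented exposed to risk, observed deaths and provisional graduated rates.
exposed = [F(1000), F(900), F(800), F(700)]
deaths = [F(11), F(10), F(13), F(15)]
rates = [F(10, 1000), F(12, 1000), F(15, 1000), F(20, 1000)]

a, b = linear_correction(exposed, deaths, rates)
modified = [a + (1 + b) * m for m in rates]
new_dev = [e * m - d for e, m, d in zip(exposed, modified, deaths)]
print(float(a), float(b), first_sum(new_dev), second_sum(new_dev))
```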

A second test that has occasionally been applied when the 
graduation has been effected by means of a formula, is that of 
making the sums of the squares of the deviations a minimum, 
the deviations being either in respect of the graduated and 
observed deaths at each age or those of the graduated and 
ungraduated values of some function such as $l_x$ or $\log l_x$. 
This method, known as the method of “Least Squares”, is 
used very generally in connection with measurements in 
astronomy and other physical sciences and has given rise to 
a quite extensive literature. It is based upon the assumption 
that if in a given series of observations the relative frequency 
of an error $x$ at each observation is represented by the 
function $ke^{-x^2/c^2}$, then the probability of a conjunction of any 


* It may, perhaps, be worth pointing out that if we have obtained a 
smooth curve with a general conformity to the original facts, but not making the 
$\Sigma$(deviations) or $\Sigma^2$(deviations) vanish, this may be done by the following plan. 
Assume, for the sake of illustration, that the function graduated is the central 
death rate $m_x$. Representing by $m_x$ the graduated values of that function, by 
$E_x$ the “Exposed to Risk” in the middle of the year of age and by $\theta_x$ the 
observed deaths, let 

$$ \Sigma(m_xE_x-\theta_x)=A $$
$$ \Sigma^2(m_xE_x-\theta_x)=B $$

then, if $m'_x=a+(1+b)m_x$ be the modified rates required, 

$$ a\,\Sigma(E_x)+b\,\Sigma(E_xm_x)=-A $$
$$ a\,\Sigma^2(E_x)+b\,\Sigma^2(E_xm_x)=-B $$

whence $a$ and $b$ are determined. 

If the table on the whole follows Makeham’s law the use of this form of 
correction enables us to neglect all orders of differences in the preliminary 
adjustment of $m_x$ or $\mu_x$. Formulæ may thus be employed (as for example, a 
simple double summation in groups of 10 values, or, still better, successive 
summations in 10’s, 5’s and 2’s) giving a much smoother curve than when 
account has to be taken of second differences, the resulting systematic error of 
this first graduation being corrected as above. 

In the alternative, if $m'_x=m_x+a+bx$, 

$$ a\,\Sigma(E_x)+b\,\Sigma(xE_x)=-A $$
$$ a\,\Sigma^2(E_x)+b\,\Sigma^2(xE_x)=-B. $$

This method may be employed in conjunction with Mr. Lidstone’s plan of using 
a standard table as a base line for purposes of graduation. 
set of errors $x_1$, $x_2$, $x_3$, &c., will be proportional to the value 
of the product 

$$
e^{-x_1^2/c^2}\cdot e^{-x_2^2/c^2}\cdot e^{-x_3^2/c^2}\cdot\text{\&c.},
$$

which clearly has a maximum value when the index of $e$ is 
numerically a minimum, i.e., when the sum of the squares 
of the errors $(x_1^2+x_2^2+x_3^2+\text{\&c.})$ is the least possible. This 
expression assumes that the average error, and therefore the 
probability of a unit error, in each observation is the same, 
an assumption which may often be fairly made in respect to 
independent measurements of a physical quantity. If the 
observations are not of the same weight, so that the 
probabilities of the errors $x_1$, $x_2$, $x_3$, &c., in the respective 
measures are 

$$
e^{-x_1^2/c_1^2},\quad e^{-x_2^2/c_2^2},\quad e^{-x_3^2/c_3^2},\quad\text{\&c.},
$$

then the most probable solution will evidently be that which 
makes the sum of these exponents the least possible.* 

The assumptions upon which this method is based are not 
strictly in accord with the conditions of a mortality experience 
or similar statistical observation. If the method is applied to 
the deviations between the observed and graduated deaths, 
the objection may be raised that the observations at different 
ages are not of equal weight, and that the probability of a 
unit error varies at each successive age, while in each case 
the probability of a given error can only be approximately 
expressed by the normal function $ke^{-x^2/c^2}$, positive and 
negative errors not being equally probable. It is, of course, 
possible suitably to weight the observations, so that a 
unit error is made equally probable. For example, if at 
any given age there are $n$ “exposures”, and if the true 
probability of death is $q$, then the “standard deviation” or 
$\sqrt{\text{average square deviation}}=\sqrt{nq(1-q)}$, and the probability 
of a difference of $x$ between the expected and observed 
deaths is approximately $\kappa e^{-x^2/2nq(1-q)}$; the error in the formula 
when $x$ is positive nearly compensating the error when $x$ is 
negative. Hence, if the “Exposed to Risk” and “Died” at 
each age are multiplied by the factor $[nq(1-q)]^{-1/2}$, where $q$ 


* See Note G, p. 117. 
is to be taken at its true or graduated value,* then the 
observations may be considered to be properly weighted for 
the application of the method of least squares. 
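
The weighting just described amounts to dividing each deviation by its standard deviation. A minimal sketch, with invented figures and hypothetical function names, and with $q$ taken at an assumed graduated value:

```python
import math

def standardised_deviation(exposed, deaths, q_graduated):
    """Deviation of observed from expected deaths, divided by sqrt(n q (1 - q))."""
    expected = exposed * q_graduated
    sd = math.sqrt(exposed * q_graduated * (1 - q_graduated))
    return (deaths - expected) / sd

def weighted_sum_of_squares(exposed, deaths, q_graduated):
    """Quantity to be made a minimum when the observations are so weighted."""
    return sum(standardised_deviation(n, d, q) ** 2
               for n, d, q in zip(exposed, deaths, q_graduated))

# Invented example: 1,000 exposed, 110 deaths, graduated rate .10.
z = standardised_deviation(1000, 110, 0.10)
total = weighted_sum_of_squares([1000, 2000], [110, 190], [0.10, 0.10])
print(round(z, 4), round(total, 4))
```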

We shall see in the following lectures that there is an 
intimate relation between the criteria of least squares and 
moments. This will be better discussed after considering the 
question of frequency curves and the process of fitting them 
to a set of statistical observations. 


* The ungraduated values of $q$ cannot be used, as this would result in undue 
weight being given at all ages where the observed mortality was in excess of the 
average, and insufficient weight where it was in defect. Consequently, the 
mortality table resulting from this process would on the whole overestimate the 
mortality throughout. In other words, the use of the unadjusted values of $q$ 
introduces a systematic or “biassed” error into the calculations. If this is 
avoided, however, a very rough approximation to the graduated curve of $q$ will 
give weights sufficiently near the truth for practical purposes, as a slight change 
in the relative weights of a given series of observations produces but little result 
upon the final solution. 


THIRD LECTURE. 


I PROPOSE in the present lecture to consider generally the 
use of frequency curves in relation to actuarial statistics. 
We have seen that the graphic method of dealing with these 
statistics, as also methods based upon finite difference formulæ, 
assume only that the true law of the series, if known, would 
be found to be represented by a continuous curve amenable to 
the ordinary processes of interpolation. It is often possible, 
however, to see that the ungraduated series can be well 
represented by a curve of a certain distinct character, and 
when this is found to be the case more satisfactory results are 
obtained, particularly where the data are few, by fitting to the 
original series a curve corresponding to its observed general 
character, so determining the constants in the equation of the 
curve as to secure the closest agreement with the ungraduated 
curve. If for example we turn to the series in column (2) of 
Table I, it will be at once seen that the general character of the 
series accords very closely to the “normal” frequency curve, 
or to some curve having the same general features. When 
we find that, by giving suitable values to the constants, a 
frequency curve can be made to fit the observations within 
the limits of the errors of observation we may be satisfied that 
the graduated curve thus produced is probably a better 
representation of the original than any that would result from 
a graphic or finite difference method of graduation. 

Any curve which exhibits the law of variation in a 
particular function, such as a table of I, d» or Kx, may be 
considered for our purpose as a frequency curve. The 
expression is usually, however, confined to that class of curves 
which experience seems to show to be specially applicable to 
the observed distributions of deviations from mean values in 
statistical tables. We have already seen examples of such 
tables where the frequency of the deviations of measures from 
their mean value follows certain comparatively simple laws. 
Professor Karl Pearson has examined a considerable variety 
of statistical data (mainly, but not entirely, biological) and 
finds that in practically all the cases examined the distribution 
of the various measurements may be represented fairly closely 
by one or other of the class of curves derived from the 
differential equation 

1 dy ba—2a? 

Dida, bee aay eee Gs (1) 


where $x$ represents the magnitude of a given deviation 
from the mean of a series of measures and y the frequency 
of such deviation. 

As this group of curves is of considerable importance, 
though less so perhaps in relation to actuarial than in relation 
to some other classes of statistics, it is convenient to consider 
them first. It is not necessary here to discuss these 
curves analytically; the student may be referred to the 
original papers of Professor Karl Pearson*, or to an 
admirably condensed résumé by Mr. Robert Henderson in 
the Journal of the Actuarial Society of America, reprinted 
J.I.A., xli, 429-442; and to Mr. W. Palin Elderton’s treatise on 
“Frequency Curves and Correlation” in which Professor 
Pearson’s methods are fully described. The table at the end 
of these lectures, which gives a sufficiently complete summary 
of such of the algebraical properties of these curves as 
are most useful in practice, is, with some unimportant 
modifications, based upon that given by Mr. Henderson in 
his paper. It will be sufficient for our present purpose 
to give a brief general description of these curves and of 
their use in connection with actuarial data. 

We have already seen that the general character of curves, 
such as those of Tables I and II, is approximately determined 
by the average value of the squares and cubes of the 
deviations of the variable from its mean value; the former 
giving a measure of the compactness or diffuseness of the 
curve, that is, of the average extent of the deviations from the 
mean irrespective of their direction; the latter a measure of 
their departure from symmetry, or of the “skewness”, of the 
curve. It will be useful at this point somewhat to extend 


* Phil. Trans., vol. 186, p. 343; vol. 197, p. 443, &c. 
this general statement, and, before proceeding to a description 
of particular curves, to explain more in detail what is 
meant by the “moments” of a curve. 

If we suppose $y=f(x)$ to represent the equation to a 
given curve, $x$ varying between the limits $h$ and $k$, the 
total area of the curve will be represented by the 
expression: 

$$
\text{area}=\int_k^h y\,dx.
$$


We may suppose, for instance, to give definiteness to our 
ideas, that the function $y$ represents the numbers under 
observation between age $x$ and $x+dx$, the number of “years 
of life” observed between these ages being $y\,dx$, and the area 
of the curve, the sum of all these quantities, being the total 
years of life observed at all ages. If we now multiply 
each value of $y\,dx$ by the corresponding age $x$ and divide the 
total of these products by the total number of the “exposed”, 
we shall have the average age of the whole. Put into 
symbols: 

$$
\int_k^h xy\,dx\div\int_k^h y\,dx=\text{average value of }x \qquad (2)
$$

= 1st moment of the curve round the ordinate for which $x=0$ 
= $m_1$, say. 

Similarly, 

$$
\int_k^h x^ny\,dx\div\int_k^h y\,dx=\text{average value of }x^n
$$

= $n$th moment round ordinate for which $x=0$ 
= $m_n$. 

The moments of the curve may be taken round any 
ordinate we please. If, for example, the average value 
of $x$ as found by equation (2), is $x_1$, then the ordinate 
corresponding to this value of $x$ passes through the centre 
of gravity of the curve, and is termed the “centroid vertical.” 
In general it is most convenient to take the value of the 
moments of the curve round this centroid vertical, for which 
obviously the first moment vanishes. The expression for the 
$n$th moment round this ordinate then becomes: 

$$
\int_k^h (x-x_1)^ny\,dx\div\int_k^h y\,dx=\mu_n, \qquad (3)
$$

the average value of the $n$th power of the deviations $(x-x_1)$ 
between the values of x and the mean value. When the 
moments of a curve are spoken of without qualification, 
it will be understood that they are the moments round 
the “centroid vertical.” These moments are, of course, 
those already referred to in Lecture I., p. 7, as representing 
the sums of the powers of the deviations of x from its 
mean value. 

The following formulæ, which may be readily demonstrated,* 
connect the values of the moments round the “centroid 
vertical” with the moments round the ordinate for which 
$x=0$. Using the same notation as above, we have 

$$
\begin{aligned}
\mu_0 &= 1 \\
\mu_1 &= 0 \\
\mu_2 &= m_2-(m_1)^2 \qquad (4) \\
\mu_3 &= m_3-3m_1m_2+2(m_1)^3 \\
\mu_4 &= m_4-4m_1m_3+6(m_1)^2m_2-3(m_1)^4
\end{aligned}
$$


where the law of the coefficients is sufficiently obvious. 
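
These relations may be verified on any simple distribution. The sketch below, in which the function names are illustrative, applies them to the terms of the symmetrical point binomial of the sixth power and compares the result with the moments computed directly round the mean.

```python
from fractions import Fraction

def raw_moment(xs, freqs, n):
    """n-th moment round the ordinate x = 0."""
    return Fraction(sum(f * x ** n for x, f in zip(xs, freqs)), sum(freqs))

def central_from_raw(m1, m2, m3, m4):
    """Moments round the centroid vertical, from the moments round x = 0."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return mu2, mu3, mu4

def central_direct(xs, freqs):
    """The same moments computed from the deviations themselves."""
    mean = raw_moment(xs, freqs, 1)
    total = sum(freqs)
    return tuple(sum(f * (x - mean) ** n for x, f in zip(xs, freqs)) / total
                 for n in (2, 3, 4))

xs = list(range(7))
freqs = [1, 6, 15, 20, 15, 6, 1]          # terms of (1/2 + 1/2)^6, times 64

m1, m2, m3, m4 = (raw_moment(xs, freqs, n) for n in (1, 2, 3, 4))
converted = central_from_raw(m1, m2, m3, m4)
direct = central_direct(xs, freqs)
print(converted, direct)
```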

For the particular family of curves arising from the 
differential equation (1) formulæ may readily be found 
for the moments involving the various constants of the 
curves, and inversely, the values of the constants can be 
expressed in terms of the moments. The formulæ for 
the higher moments being sometimes complicated, it 
is more convenient to tabulate certain functions of the 
moments, e.g.: 


ae)". pb tee, eis 
P= Ga) P= a) Y= Ba 3 


from which the constants of the curves may be obtained more 
readily, and which are also useful in discriminating between the 
curves applicable to a given set of observations. 


* See Elderton, pp. 17-19; Henderson, J.I.A., xli, 431-2. 

The various curves arising from the differential equation 
(1) may, for our present purpose, be conveniently classified 
as under: 

Class I. Symmetrical curves. Range limited. 
Class II. Symmetrical curves. Range unlimited. 
Class III. Skew curves. Range limited in both directions; 
Class IV. Skew curves. Range limited in one direction; 
Class V. Skew curves. Range unlimited in either direction; 

the various types of curve being as follow. It will be seen 
that some of these Classes are represented only by a single 
type of curve: 

Class I. Symmetrical curves of limited range. In this 
class we have only the single curve. 

Type 1. 
$$
y=\kappa\left(1-\frac{x^2}{a^2}\right)^m
$$

The values of $x$ range from $+a$ to $-a$, for either of 
which values of the variable $y$ becomes zero. 

The average value of $x$ is obviously zero, the corresponding 
ordinate $y$ is a maximum, and clearly bisects the area enclosed 
between the curve and the axis of $x$. In other words, the 
“mean”, “mode”, and “median” of the curve all coincide, 
as in all symmetrical curves. 

The second moment of the curve 
$$ =\frac{a^2}{2m+3} $$
and the “standard deviation” 
$$ =\frac{a}{\sqrt{2m+3}}. $$
The fourth moment 
$$ =\frac{3a^4}{(2m+3)(2m+5)}. $$

The value of $m$ will usually be positive when $y$ equals zero at 
both limits. If $m>0$ and $<1$ the curve cuts the base-line at an 
angle. If $m$ is negative the value of $y$ becomes infinite at 
both limits, and $m$ is always $>-1$. 
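
The expressions for the second and fourth moments of Type 1 may be checked by a rough numerical integration. This is a sketch only: the constants $a=5$ and $m=3$ are hypothetical, chosen for illustration, and the midpoint rule is used for simplicity.

```python
def type1_moments(a, m, panels=20000):
    """Midpoint-rule estimates of the 2nd and 4th moments of
    y = (1 - x^2/a^2)^m taken over the range -a to +a."""
    h = 2 * a / panels
    area = second = fourth = 0.0
    for i in range(panels):
        x = -a + (i + 0.5) * h
        y = (1 - x * x / (a * a)) ** m
        area += y
        second += x * x * y
        fourth += x ** 4 * y
    return second / area, fourth / area

a, m = 5.0, 3.0                       # hypothetical constants
mu2, mu4 = type1_moments(a, m)

# Values given by the formulae in the text.
mu2_formula = a ** 2 / (2 * m + 3)
mu4_formula = 3 * a ** 4 / ((2 * m + 3) * (2 * m + 5))
print(mu2, mu2_formula, mu4, mu4_formula)
```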

This curve has a close relationship with the symmetrical 
point binomial curve, whose terms are proportional to the 
terms in the expansion of $(\tfrac12+\tfrac12)^n$, the general term of which 
may be written 

$$
y=\frac{\kappa}{\left(\frac{n}{2}+x\right)!\,\left(\frac{n}{2}-x\right)!}.
$$

[It will, of course, be understood that the $\kappa$’s in these 
formulæ, and in others, are not identical, but 
simply stand for some constant in each case, the 
numerical value of which is determined by the 
area of the curve.] 

The binomial curve, however, can be conveniently used 
only to represent the definite points corresponding to integral 
values of $\tfrac{n}{2}\pm x$, whereas Type 1 represents a continuous 
curve (Note D, p. 122). The data with which an actuary has 
to deal are generally in the latter form, for example, the 
numbers living, the number of deaths, withdrawals, &c., 
between the ages $x$ and $x+1$, and although usually the 
number of terms in the series is so considerable that the 
curve may be treated as a series of points, on the other hand, 
a binomial having so many terms will not generally be found 
a suitable curve to employ. In most instances where a series 
can be fairly represented by the symmetrical binomial, it can 
also be fairly represented by Type 1, with possibly some 
slight difference in range, as will be seen later. 

There are other symmetrical curves of limited range, 
which are in the nature of frequency curves, but which do 
not belong to the family of curves derived from equation 
(1): such, e.g., as the curve 

y=xe CHa 
which, however, we need not discuss here. 

Class II. Symmetrical curves of unlimited range. In this 
class are two curves belonging to the family with which we 
are dealing. 

Type 2. 
$$
y=\kappa e^{-x^2/c^2} \qquad (5)
$$

This is the curve of “facility of error”, or the “normal” 
frequency curve. 

The average value of $x$ is clearly zero, corresponding to 
the “mode” or the maximum value of $y$, and to the median. 
The second moment $=\mu_2=\frac{c^2}{2}$, and the standard 
deviation $=\frac{c}{\sqrt{2}}$. 

Type 1 evidently transposes into this curve when the 
value of $a$, and hence the range of the curve, is made 
indefinitely great. If we put $\frac{a^2}{m}=c^2$, making both $a^2$ and $m$ 
indefinitely great, but their ratio finite, we have 

$$
\text{Limit}\left(1-\frac{x^2}{a^2}\right)^m=e^{-x^2/c^2}.
$$
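
The approach of Type 1 to the normal curve may be seen numerically. In the sketch below the values $c^2=3$ and $x=1.5$ are arbitrary choices for illustration, and $a^2$ is increased with $m$ so that their ratio remains equal to $c^2$.

```python
import math

def type1(x, a2, m):
    """Ordinate of Type 1 with unit central ordinate."""
    return (1 - x * x / a2) ** m

def type2(x, c2):
    """Ordinate of the normal curve with unit central ordinate."""
    return math.exp(-x * x / c2)

c2, x = 3.0, 1.5                      # arbitrary illustrative values
gaps = []
for m in (5, 50, 500, 5000):
    a2 = c2 * m                       # keeps a^2 / m equal to c^2
    gaps.append(abs(type1(x, a2, m) - type2(x, c2)))

print(gaps)
```

The successive differences diminish roughly in proportion as $m$ is increased.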


Even when the range of the curve is not great, that is 
when $m$ and $a^2$ are not large numbers, there is a fairly close 
agreement between curves of Types 1 and 2 and the symmetrical 
binomial. 

This may be seen by a numerical example, the following 
table showing 

1. The values of $y=\frac{36000}{(3+x)!\,(3-x)!}$ for integral values of 
$x$, these values being proportionate to the terms in 
the expansion of the binomial $(\tfrac12+\tfrac12)^6$; 
2. The values of $y=993\left(1-\frac{x^2}{a^2}\right)^m$; 
3. The values of $y=1026\,e^{-x^2/c^2}$, 

the constants in the two latter curves being chosen to give 
as good general agreement as practicable with the binomial 
curve. 


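TABLE VI.
Showing Similarity of Types 1 and 2 to the Symmetrical Point
Binomial.

Variable $x$ | Binomial curve, $y = \frac{36000}{(3+x)!\,(3-x)!}$ | Type 1, $y = 993(1 - x^2/a^2)^m$ | Type 2, $y = 1026\,e^{-x^2/c^2}$
(1) | (2) | (3) | (4)
-4 | 0 | [illegible] | [illegible]
-3 | 50 | 47 | 56
-2 | 300 | 303 | 283
-1 | 750 | 752 | 743
0 | 1,000 | 993 | 1,026
1 | 750 | 752 | 743
2 | 300 | 303 | 282
3 | 50 | 47 | 56
4 | 0 | [illegible] | [illegible]
Totals | 3,200 | [illegible] | [illegible]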
Had the range of the curves been greater, the binomial
being taken to a higher power, and the values of the
constants $a^2$ and $m$ in col. (3) and of $c^2$ in col. (4) been
larger, the agreement of the three curves would have been
correspondingly closer. As it is, the first two curves are
very nearly identical, while the “normal” curve, although
theoretically of unlimited range, is fairly close to the
binomial, the terms corresponding to values of $x$ numerically
greater than 4 amounting to less than 1 in the aggregate.
It will be noticed that the values of $y$ in the limited curves
necessarily diminish more rapidly as the limiting values of $x$
are approached, while the normal curve is less flat in the
centre.
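The binomial column of the comparison can be reproduced directly, and a normal curve set beside it. In the sketch below the normal constants are obtained by an assumed rule (equal total and equal second moment); this is an illustration of the agreement and not the choice of constants made in the table, so the resulting ordinates differ slightly from those printed.

```python
import math

def binomial_column(x):
    """36000 / ((3+x)! (3-x)!): fifty times the coefficients of (1/2 + 1/2)^6."""
    if abs(x) > 3:
        return 0.0
    return 36000 / (math.factorial(3 + x) * math.factorial(3 - x))

xs = list(range(-4, 5))
binom = [binomial_column(x) for x in xs]
total = sum(binom)
variance = sum(x * x * y for x, y in zip(xs, binom)) / total

# Assumed matching rule: same area and same second moment as the binomial.
c2 = 2 * variance                          # since mu_2 = c^2 / 2
kappa = total / math.sqrt(math.pi * c2)    # area of kappa * exp(-x^2/c^2)
normal = [kappa * math.exp(-x * x / c2) for x in xs]

for x, b, g in zip(xs, binom, normal):
    print(f"{x:3d} {b:8.0f} {g:8.1f}")
```

With these assumed constants the normal ordinates follow the binomial terms closely in the centre and exceed them a little in the tails, as the text observes.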


Type 3. $$y = \kappa\left(1 + \frac{x^2}{a^2}\right)^{-m} \qquad (6)$$

This curve, which is also symmetrical and unlimited in
range, diverges from the normal curve in a direction opposite
to Type 1, the values of $y$ diminishing, when $x$ is large, more
slowly than in the normal curve. The curve transposes into
the latter (Type 2) when $a^2$ and $m$ are indefinitely large, $\dfrac{a^2}{m} = c^2$
being, however, finite. We then have

$$\text{Lt.}\left(1 + \frac{x^2}{a^2}\right)^{-m} = e^{-x^2/c^2}.$$

The average value of $x$ in the curve $y = \kappa(a^2 + x^2)^{-m}$ is
zero, corresponding again to the “mode”; the second
moment $= \mu_2 = \dfrac{a^2}{2m-3}$, and the “standard deviation”
$= \dfrac{a}{\sqrt{2m-3}}$. The fourth moment $= \mu_4 = \dfrac{3a^4}{(2m-3)(2m-5)}$, and, it is
clear, becomes infinite unless $m > \tfrac{5}{2}$. Indeed, the higher
moments of the curve must become infinite whatever be the
value of $m$.

The classes of symmetrical curves are of somewhat limited 
application to actuarial statistics, although there are certain 
cases in which they represent the observations fairly well. 

Class III. Skew curves. Range limited in both directions.
There is only a single curve of this class in the family of
curves we are considering, namely:

Type 4. $$y = \kappa\left(1 - \frac{x}{a}\right)^{m_1}\left(1 + \frac{x}{a}\right)^{m_2} \qquad (8)$$

The values of $x$ range from $-a$ to $+a$; the “mode” is
at $x = \dfrac{m_2 - m_1}{m_1 + m_2}\,a$, for which value $y$ is a maximum; the
mean value of $x$ is $\dfrac{m_2 - m_1}{m_1 + m_2 + 2}\,a$. The expressions for the
moments of the curve are simplified by putting it into the
form given in the table on p. 140. If we write $m_1 = np - 1$, and
$m_2 = nq - 1$ (where $p + q = 1$), the equation to the curve
(which does not, of course, change in character with this
transposition) becomes

$$y = \kappa\left(1 - \frac{x}{a}\right)^{np-1}\left(1 + \frac{x}{a}\right)^{nq-1} \qquad (9)$$

the variable having the same range of values $-a$ to $+a$, the
“mode” being at $x = \dfrac{n}{n-2}\,(q - p)\,a$; the average value of
$x = (q - p)\,a$; the second moment $= \mu_2 = \dfrac{4pq}{n+1}\,a^2$, and the
“standard deviation” the square root of this quantity.
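The expressions for the mode, mean and second moment of Type 4 may be verified by direct summation of ordinates. The sketch below assumes the form $y = (1 - x/a)^{np-1}(1 + x/a)^{nq-1}$ with $\kappa = 1$; the constants $a = 3$, $n = 8$, $p = 0.7$ are hypothetical and chosen only so that both indices exceed unity.

```python
def type4(x, a, n, p):
    """Ordinate of (1 - x/a)^(np-1) * (1 + x/a)^(nq-1), with kappa = 1."""
    q = 1.0 - p
    return (1 - x / a) ** (n * p - 1) * (1 + x / a) ** (n * q - 1)

def summarise(a, n, p, steps=100_000):
    """Mean, second central moment and mode from mid-ordinates on (-a, a)."""
    h = 2 * a / steps
    xs = [-a + (i + 0.5) * h for i in range(steps)]
    ys = [type4(x, a, n, p) for x in xs]
    area = sum(ys)
    mean = sum(x * y for x, y in zip(xs, ys)) / area
    var = sum((x - mean) ** 2 * y for x, y in zip(xs, ys)) / area
    mode = xs[ys.index(max(ys))]
    return mean, var, mode

a, n, p = 3.0, 8.0, 0.7          # hypothetical constants
q = 1 - p
mean, var, mode = summarise(a, n, p)

print("mean", round(mean, 4), "formula", round((q - p) * a, 4))
print("mu_2", round(var, 4), "formula", round(4 * p * q * a * a / (n + 1), 4))
print("mode", round(mode, 4), "formula", round(n / (n - 2) * (q - p) * a, 4))
```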

When $m_1 = m_2 = m$, this Type evidently transposes into
Type 1, and thence into Type 2 when $m$ is infinite.

This curve is related to the skew point binomial arising
from the expansion of $(p+q)^n$, where $p$ and $q$ have
approximately the same values as in equation (9); and
where the index of the binomial is not too small, there is
a fair numerical agreement, as may be seen in the following
table, where the figures given in col. (2) are proportional to
the terms in the binomial expansion of $\left(\tfrac{1}{3} + \tfrac{2}{3}\right)^{6}$:


TABLE VII.

Showing Numerical Similarity of the Curve of Type 4 with the
Skew Binomial.

Value of Variable $x$ | Binomial curve, $y = \frac{5760}{(3+x)!\,(3-x)!}\,2^{x}$ | Type 4, $y = \kappa(4.75 - x)^{m_1}(6.25 + x)^{11.46}$
(1) | (2) | (3)
-3 | 1 | [illegible]
-2 | 12 | 13
-1 | 60 | 61
0 | 160 | 159
1 | 240 | 240
2 | 192 | 194
3 | 64 | 60
4 | 0 | 1
Totals | 729 | [illegible]
It will be seen that for so small a value of $n$ as 6 the
binomial curve can be closely represented by means of
selected points in the continuous curve of Type 4. When the
value of $n$ is large, a much closer agreement is obtainable.

The skew binomial is of importance to the actuary as 
representing the law of the deviations between the actual 
number of events observed in a given series of trials and the 
“expected” number when computed by the true value 
of the probabilities. There are very many statistical 
distributions capable of being well represented by the 
binomial curve if the latter is treated as a continuous curve. 
This procedure is not, however, convenient in practice, as it
rarely happens that the given ordinates coincide with the
integral values of $x$ in the general term $\kappa\,\dfrac{n!}{x!\,(n-x)!}\left(\dfrac{p}{q}\right)^{x}$, and,
moreover, the analysis, when the curve is treated as
continuous, is not very simple. (See Note D, p. 122.)

The form of curve corresponding to Type 4 varies very
considerably with certain changes in the values of the
constants $m_1$ and $m_2$. In its more usual form, when both
$m_1$ and $m_2$ are $> 1$, as in Table VII, the curve bears a
general resemblance to the age distribution of the
“entrants” in a mortality, or similar experience (see
Table II), also to the numbers of the exposed to risk; to
the number of marriages, or to the rate of marriage at
various ages; to the average number of children under age,
or to the cost of their pensions at the death of the father, a
function of use in pension fund valuations; to the number
of retirements in such funds where superannuation occurs
on invalidity and not at a specified age; to the incidence
of attacks, or of mortality, from certain diseases, &c. Owing
to the number of constants involved (as the increment of $x$
may represent any period of time, there are virtually five),
the curve is very adaptable.

It will be readily seen that if the values of both $m_1$ and
$m_2$ in equation (8) are high the curve makes very close
contact with the axis of $x$ at either limit; if $m_1$ or $m_2$ lies
between 0 and 1, the curve meets the axis of $x$ at an angle;
whereas, if either or both of them are negative, the expression
becomes infinite at one or both limits. The area of the curve
and the moments do not, however, become infinite if both $m_1$
and $m_2$ are greater than $-1$.
Class IV. Skew curves. Range limited in one direction.
There are two curves of this class.

Type 5. $$y = \kappa\, x^{m} e^{-x/a} \qquad (10)$$

which is a limiting form of curve No. 4, the values of $x$
ranging from 0 to $\infty$.

The “mode” is at $x = ma$; the mean value of $x$ is
$(m+1)a$; the second moment $(m+1)a^2$; and the third
moment $2(m+1)a^3$; these being sufficient to determine the
constants.

In the usual form of the curve, that is when $m > 1$, this
curve represents fairly well some of the statistical distributions
represented by curve No. 4. Owing to the feature that as $x$
becomes large the successive terms have a tendency to run
into a geometrical progression, it is not so well suited to such
distributions as that of the “exposed to risk”, where the effect
of the rapid rise in the rate of mortality at the older ages
makes itself felt in an increasingly rapid diminution in the
values of $y$. This is somewhat unfortunate, as the curve is a
simple one, determined by the values of its first three
moments, and except for the reason stated, well suited for use
in connection with Makeham’s formula for the force of
mortality.

As in Type 4, the character of this curve may be entirely
changed by an alteration in the values of the constant $m$. If
this constant vanishes the curve becomes a diminishing
geometrical progression; while for negative values of $m$ the
curve becomes infinite at the lower limiting value of $x$. The
value of $m$ must in any case be $> -1$.

The actuary has to deal with several distributions roughly
similar to a diminishing geometrical progression as, for
example, the curve of infant mortality, the rate of withdrawal
in successive policy years, or the difference between the select
and ultimate mortality rates in a select mortality table. Other
expressions giving a similar form of curve may be employed to
represent these distributions as, for example, $y = \kappa\,(a + e^{-mx})$,
with a minimum value of $\kappa a$ when $x$ is very large; or
$y = \kappa\,(x + a)^{-m}$, where if $a$ is small we have a curve again
similar to that of infant mortality, $x$ representing the age.


Type 6. $$y = \kappa\left(\frac{x}{a} - 1\right)^{m_1}\left(\frac{x}{a} + 1\right)^{-m_2} \qquad (11)$$

where the limiting values of $x$ are $a$ and $\infty$, with an
average value of $x = \dfrac{m_2 + m_1}{m_2 - m_1 - 2}\,a$, the “mode” occurring
at $x = \dfrac{m_2 + m_1}{m_2 - m_1}\,a$. The expressions for the moments are much
simplified by writing the equation to the curve in the form
given in the Table on pp. 140-1.


Type 7. $$y = \kappa\, x^{-m} e^{-a/x} \qquad (12)$$

where $x$ varies between 0 and $\infty$, having an average value of
$\dfrac{a}{m-2}$, with the “mode” at $\dfrac{a}{m}$. The second moment
$= \dfrac{a^2}{(m-2)^2(m-3)}$, and the “standard deviation”
consequently $= \dfrac{a}{(m-2)\sqrt{m-3}}$.

Here $m$ must be $> 3$, or the second moment becomes $\infty$,
and the fourth moment becomes infinite unless $m$ is greater
than 5.

Neither this nor the preceding curve is of any wide
application in actuarial statistics, owing to the fact that the
values of $y$ for large values of $x$ diminish with increasing
slowness; a feature not often met with in practice except in
such a function as the “rate of withdrawal.” The same
remark holds good of the single curve constituting Class V.


Class V. Skew curves. Range unlimited in either direction.

Type 8. $$y = \kappa\left(1 + \frac{x^2}{a^2}\right)^{-m} e^{-\nu \tan^{-1}(x/a)} \qquad (13)$$

This is the only skew curve of this family having
unlimited range. The average value of $x = -\dfrac{\nu a}{2m-2}$; the
“mode” is at $-\dfrac{\nu a}{2m}$.
The expressions for the moments and their functions are
simplified by writing $\left(\dfrac{r}{2} + 1\right)$ for $m$ in equation (13), as in
the Table on pp. 140-1. For the reason stated above, the curve
is not specially useful to Actuaries.
Assuming that a given statistical series can be represented
by one or other of the curves above described, the appropriate
curve can be found by means of certain criteria based upon
an examination of the “moments” of the curve; that is to
say, the sums of the powers of the deviations from the mean
value. These criteria are furnished by the table on pp. 140-1,
above referred to.

As the calculation of the criterion is somewhat lengthy,
it may be noted that if the logarithms of $y$ are tabulated
for equal intervals of the variable $x$, and the values of
$\Delta^2 \log y$ taken out, these give us information as to the
nature of the curve. The value of $\Delta^2 \log y$ will be
constant and negative for the “normal” curve Type 2;
negative and symmetrical with a minimum numerical value
in the centre of the range, for Type 1, or for any binomial
curve; uniformly negative, non-symmetrical, and with a
numerical minimum in the case of Type 4 (where this
curve vanishes at the limits); and uniformly negative and
continuously decreasing towards the upper limit of $x$ in the
case of Type 5, where this curve vanishes at the limits.
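The behaviour of $\Delta^2 \log y$ just described can be illustrated on two of the curves. The sketch below is an illustration under assumed constants (a normal curve centred at 5.5 with $c^2 = 8$, and a Type 5 curve $x^3 e^{-x/1.5}$); these values are hypothetical and are not taken from any table in the lectures.

```python
import math

def second_differences(values):
    """Delta^2 of a series tabulated at equal intervals."""
    return [values[i] - 2 * values[i + 1] + values[i + 2]
            for i in range(len(values) - 2)]

xs = range(1, 11)

# Hypothetical normal curve and hypothetical Type 5 curve, tabulated as logs.
normal_logs = [math.log(1000 * math.exp(-(x - 5.5) ** 2 / 8)) for x in xs]
type5_logs = [math.log(x ** 3 * math.exp(-x / 1.5)) for x in xs]

d2_normal = second_differences(normal_logs)
d2_type5 = second_differences(type5_logs)

print([round(d, 4) for d in d2_normal])   # constant and negative
print([round(d, 4) for d in d2_type5])    # negative, numerically decreasing
```

For the normal curve the second difference is the constant $-2/c^2$; for Type 5 it is negative throughout and falls off numerically as $x$ increases, which is the distinction the text relies upon.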

In the case, therefore, of those curves most useful to the
Actuary the function $\Delta^2 \log y$, computed for the ungraduated
curve, enables us to select generally the formula most suited
to the series. For this purpose if the data are grouped it
will generally be better to compute the approximate values
of the central ordinates of each group by an interpolation
formula, such as that given on p. 57.

Other types of curves will sometimes be found useful
besides those arising from the differential equation on p. 39;
but they do not generally lend themselves so readily to the
method of moments.

If, for example, we write

$$y = \kappa\, e^{-\frac{m}{a+x} - \frac{n}{b+x}} \qquad (14)$$

we obtain, when $m$ and $n$ are numerically unequal, a skew
curve vanishing when $x = -a$ or $-b$. We may deal with this
curve in practice by determining the values of equidistant
ordinates as shown on pp. 57-8. Thus

$$\log y = u_x = \kappa' - \frac{m}{a+x} - \frac{n}{b+x} \qquad (15)$$

As $\log y$ becomes $-\infty$ at the limits, we multiply both sides
by $(a+x)(b+x)$, thence

$$u_x\,[ab + (a+b)x + x^2] = \kappa'\,[ab + (a+b)x + x^2] - m(b+x) - n(a+x) = A + Bx + Cx^2 \qquad (16)$$

where the unknowns are $a$, $b$, $A$, $B$ and $C$.

If we difference three times the right hand side vanishes
and we have a series of expressions involving $(ab)$ and $(a+b)$
equated to zero, and by suitably grouping these, or by using
the method of moments, $a$ and $b$, and thence the remaining
constants, may be evaluated.
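The differencing process may be sketched as follows. Since $u_x[ab + (a+b)x + x^2]$ is a quadratic in $x$, its third differences vanish, giving $ab\,\Delta^3 u_x + (a+b)\,\Delta^3(x u_x) + \Delta^3(x^2 u_x) = 0$ for every $x$, which is linear in $ab$ and $a+b$. The constants below are hypothetical, and the grouping of the equations by least squares is one convenient assumption among several possible; it is not stated in the lectures.

```python
import math

def diff3(v):
    """Third differences of an equally spaced series."""
    return [v[i + 3] - 3 * v[i + 2] + 3 * v[i + 1] - v[i]
            for i in range(len(v) - 3)]

# Hypothetical constants: curve vanishing at x = -2 and x = 12.
kappa1, m, n, a, b = 4.0, 3.0, -5.0, 2.0, -12.0
xs = list(range(0, 11))
u = [kappa1 - m / (a + x) - n / (b + x) for x in xs]

d_u = diff3(u)
d_xu = diff3([x * ux for x, ux in zip(xs, u)])
d_x2u = diff3([x * x * ux for x, ux in zip(xs, u)])

# Solve P*d_u + S*d_xu = -d_x2u for P = ab and S = a + b (normal equations).
s11 = sum(f * f for f in d_u)
s12 = sum(f * g for f, g in zip(d_u, d_xu))
s22 = sum(g * g for g in d_xu)
r1 = -sum(f * r for f, r in zip(d_u, d_x2u))
r2 = -sum(g * r for g, r in zip(d_xu, d_x2u))
det = s11 * s22 - s12 * s12
P = (r1 * s22 - r2 * s12) / det
S = (s11 * r2 - s12 * r1) / det

# a and b are the roots of t^2 - S t + P = 0.
disc = math.sqrt(S * S - 4 * P)
a_hat, b_hat = (S + disc) / 2, (S - disc) / 2
print(round(a_hat, 6), round(b_hat, 6))
```

With $a$ and $b$ known, equation (15) is linear in $\kappa'$, $m$ and $n$, so the remaining constants follow from any three ordinates or from a further grouping.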

A similar process may be employed with advantage with a
curve such as the usual form of exposed to risk or died, when
the data are in large age groups. We may then take $u$ in
equation (15) to represent the common log of the ratio of the
numbers above age $x$ to the numbers below age $x$ in the series.
That is, if the total number in the series $= N$, the number
above age $x = Y$, we may write

$$\log_{10}\frac{Y}{N - Y} = u = \kappa' - \frac{m}{a+x} - \frac{n}{b+x} \qquad (17)$$

In many cases the constant $\kappa'$ may be omitted if the
number of groups is small; in this case $C$ in equation (16)
becomes zero. On the other hand it may sometimes be found
necessary to add a term to the right hand of equation (16)
involving $x^3$.


FOURTH LECTURE. 


——————————————— 


We shall now consider very shortly the problem of fitting
frequency curves to statistical data. To do this at length
would be impossible in the time at our disposal, and the
student who wishes to pursue the subject in detail may read the
original papers, already referred to (p. 39), of Professor Karl
Pearson, to whom the development of the subject is due,
or Mr. Elderton’s book. There are certain general principles,
however, which may be usefully considered. The method
usually employed in fitting these curves is by making the
moments of the graduated equal to those of the ungraduated
curve, which is equivalent to making the quantities
$\Sigma$(deviations), $\Sigma^2$(deviations), &c., as far as $\Sigma^4$ or $\Sigma^5$, equal
to zero. This method may not always be the most convenient
or the best for the purpose of the Actuary, but it is so
for most statistical purposes, and has come much into use
accordingly.

We have already seen that, in the case of the curves 
arising from the differential equation on p. 39, expressions 
for the moments may be obtained in terms of the constants,
which will enable us to determine the value of the constants,
when the numerical value of the moments is known. For the 
purpose of fitting the appropriate curve to any given series 
of observations it is only necessary to determine the value 
of the moments as given by the observations, that is, the 
value of the sum of the squares, cubes, &c., of the deviations 
from the mean value of the variable. 

It will be useful to consider shortly the calculation of the
numerical value of the moments in a given instance. Take
first the simplest possible case where we have to do not with
a continuous curve, but with a series of points representing
isolated ordinates, where in consequence we replace integrations
by summations. In the following table, the first column
contains the values of the independent variable $x$, the range
of values being from 0 to 6. The second column contains
the values of its function $y$, which are proportionate to the
successive terms in the expansion of the binomial $\left(\tfrac{1}{3} + \tfrac{2}{3}\right)^{6}$;
the constant multiplier 729 being introduced merely to avoid
fractions. The remaining columns, in which the average
value of $x$ and the values of the successive moments are
worked out, explain themselves. It may be remarked that in
this example the average value of $x$, and the deviations from
the average, are all integral, and it is therefore convenient
to calculate at once the moments round the average value
(“centroid vertical”). In most cases, however, the average
and the deviations will not be integral, and then it will
be more convenient to calculate the moments round the
origin or some selected middle value of the variable,
afterwards transferring the moments to the mean by the
formulæ given on p. 41.


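TABLE VIII.

Moments of the Point Binomial Curve.

$$y = 729\,\frac{6!}{x!\,(6-x)!}\left(\frac{2}{3}\right)^{x}\left(\frac{1}{3}\right)^{6-x} = \frac{720}{x!\,(6-x)!}\,(2)^{x}$$

$x$ | $y$ | $xy$ | $(x-4)y$ | $(x-4)^2 y$ | $(x-4)^3 y$ | $(x-4)^4 y$
0 | 1 | 0 | -4 | 16 | -64 | 256
1 | 12 | 12 | -36 | 108 | -324 | 972
2 | 60 | 120 | -120 | 240 | -480 | 960
3 | 160 | 480 | -160 | 160 | -160 | 160
4 | 240 | 960 | 0 | 0 | 0 | 0
5 | 192 | 960 | 192 | 192 | 192 | 192
6 | 64 | 384 | 128 | 256 | 512 | 1,024
Totals | 729 | 2,916 | 0 | 972 | -324 | 3,564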
Obviously, when the moments are calculated about the mean
the first moment is zero (because it represents the average
deviation from the average value). The even moments are
always positive, because each term is of the form $y_x x^{2n}$,
i.e., essentially positive; and if the curve is symmetrical the
odd moments vanish, because each term of the form $y_x x^{2n+1}$ is
cancelled by a term (equidistant from the mean) of the
form $y_{-x}(-x)^{2n+1}$. In general, where the curve is not
symmetrical, the third, fifth, &c., moments will not be zero.
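The working of the point binomial example can be repeated mechanically. The sketch below forms the terms $729\binom{6}{x}(2/3)^x(1/3)^{6-x} = \binom{6}{x}2^x$ and sums the powers of the deviations from the mean; the variable names are illustrative only.

```python
from math import comb

xs = range(7)
# 729 * C(6, x) * (2/3)^x * (1/3)^(6 - x) reduces to C(6, x) * 2^x.
ys = [comb(6, x) * 2 ** x for x in xs]

total = sum(ys)
mean = sum(x * y for x, y in zip(xs, ys)) / total

def central_sum(k):
    """Sum of (x - mean)^k * y over the series."""
    return sum((x - mean) ** k * y for x, y in zip(xs, ys))

sums = [central_sum(k) for k in (1, 2, 3, 4)]
moments = [s / total for s in sums]

print(ys, total, mean)
print(sums)
print([round(mu, 4) for mu in moments])
```

The first moment about the mean is zero, the even moments are positive, and the third moment is negative because the longer tail of this binomial lies below the mean.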

In the above illustration, we have considered $x$ to have
integral values only. This may be said to approximate to
the conditions of many statistical tables used by the Actuary
where $x$ represents the year of age under observation, and
where it is indifferent whether the observations are supposed
to be spread over the year in the form of a continuous curve,
or whether we consider them all to have reference to the
central point of the year. In these cases, however, $x$ will
generally have a large range of values, amounting possibly
to 60 or 80, and the labour of computing the numerical
value of the moments is then much lessened by grouping
the facts in larger sections, though we cannot then safely
assume the totals of each group to be concentrated at the
middle ordinate.

Take the set of observations in Table IX representing
for decennial age groups numbers exposed to risk in the


middle of each year of age, i.e., H,=B,—3., in the recent 


mortality experience of lives assured by ascending premium 
policies,* excluding the first ten years from entry. Here we 
have no longer the values of equidistant ordinates of the 
curve, but the area of the curve enclosed between successive 
ordinates. To obtain the moments of the curve with any 
degree of accuracy, we cannot treat these areas as 
proportional to their central ordinate. 

It will be noticed that the particular curve we are dealing
with becomes gradually zero at either extremity,† and we may
assume, without serious error, that it makes “close contact”
at either end with the axis of $x$, that is to say, is
asymptotic thereto. In these cases, Mr. Sheppard has shown‡
that very approximate values for the moments may be found


* See Unadjusted Data, Minor Classes of Assurances, p. 191.


† We omit the numbers at risk under age 25 (arising from … age 15),
amounting to only 25 in all.


‡ An elementary demonstration is given in Elderton’s Treatise, p. 28-29.
by treating the area of each successive section of the curve
as concentrated in the middle ordinate of the section; in
other words, treating the values of $y$ as representing isolated
ordinates exactly as was done in Table VIII; and then
applying to the values of the moments so found (denoted by
the symbol $m'$) the following adjustments leading to the
corrected moments denoted by the symbol $m$:

$$m_1 = m'_1$$
$$m_2 = m'_2 - \tfrac{1}{12}$$
$$m_3 = m'_3 - \tfrac{1}{4} m_1 = m'_3 - \tfrac{1}{4} m'_1$$
$$m_4 = m'_4 - \tfrac{1}{2} m_2 - \tfrac{1}{80} = m'_4 - \tfrac{1}{2} m'_2 + \tfrac{7}{240}$$

For moments round the centroid vertical these become,
remembering that $\mu_1 = 0$,

$$\mu_2 = \mu'_2 - \tfrac{1}{12}$$
$$\mu_3 = \mu'_3$$
$$\mu_4 = \mu'_4 - \tfrac{1}{2}\mu_2 - \tfrac{1}{80}$$

TABLE IX.

Ascending Premium Assurances. Experience 1863-1893.
Duration 10 years and upwards.

Calculation of Moments of “Exposed to Risk” Curve.

Ages | Exposed to Risk $y$ | $x$ | $xy$ | $x^2y$ | $x^3y$ | $x^4y$
25-35 | 2,874 | -2 | -5,748 | 11,496 | -22,992 | 45,984
35-45 | 22,020 | -1 | -22,020 | 22,020 | -22,020 | 22,020
45-55 | 26,164 | 0 | 0 | 0 | 0 | 0
55-65 | 17,391 | 1 | 17,391 | 17,391 | 17,391 | 17,391
65-75 | 7,845 | 2 | 15,690 | 31,380 | 62,760 | 125,520
75-85 | 1,761 | 3 | 5,283 | 15,849 | 47,547 | 142,641
85-95 | 81 | 4 | 324 | 1,296 | 5,184 | 20,736
Totals | 78,136 | | 10,920 | 99,432 | 87,870 | 374,292
Reduced to unit area | | | .13976 $= m'_1$ | 1.2725 $= m'_2$ | 1.1246 $= m'_3$ | 4.7903 $= m'_4$
From these results we obtain by means of the corrections
above stated:
$m_1 = .13976$; $m_2 = 1.1892$; $m_3 = 1.0897$; $m_4 = 4.1832$.
Whence, by the equations on p. 41,
$\mu_2 = 1.1697$; $\mu_3 = .5965$; $\mu_4 = 3.7122$.
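The arithmetic of Table IX and of the corrections may be retraced as follows. This is a sketch only: the figure for the group 45-55 (26,164) is taken here as the balance of the printed total of 78,136, and the transfer to the mean uses the ordinary relations between moments about the origin and about the mean, which are assumed to be those of p. 41.

```python
# Exposed to risk in decennial groups, x measured in decades from age 50.
exposed = {-2: 2874, -1: 22020, 0: 26164, 1: 17391, 2: 7845, 3: 1761, 4: 81}

total = sum(exposed.values())
raw = [sum(x ** k * y for x, y in exposed.items()) / total for k in (1, 2, 3, 4)]
m1r, m2r, m3r, m4r = raw            # uncorrected moments m'

# Sheppard's adjustments for unit grouping interval.
m1 = m1r
m2 = m2r - 1 / 12
m3 = m3r - m1r / 4
m4 = m4r - m2r / 2 + 7 / 240

# Transfer to the centroid vertical (assumed standard relations).
mu2 = m2 - m1 ** 2
mu3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
mu4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4

print([round(v, 4) for v in raw])
print(round(m2, 4), round(m3, 4), round(m4, 4))
print(round(mu2, 4), round(mu3, 4), round(mu4, 4))
```

The results agree with the values quoted in the text to within a unit in the fourth decimal place, the small differences arising from the rounding of the intermediate figures.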


If quinquennial age groups had been used, making due
allowance for the unit of time still being taken as ten years,
the corresponding values would have been

$m_1 = .13848$; $\mu_2 = 1.1741$; $\mu_3 = .5869$; $\mu_4 = 3.7160$.

Using these latter values, as the more accurate, we obtain for
the values of the functions $\beta_1$, $\beta_2$ and $\gamma$:

$\beta_1 = \mu_3^2/\mu_2^3 = .21283$; $\beta_2 = \mu_4/\mu_2^2 = 2.6957$; $\gamma = \dfrac{\beta_1 + 4}{\beta_2 + 3} = .7397$.

As $\mu_3$ does not vanish, and $\gamma$ is $> \tfrac{2}{3}$, we see from the table
on pp. 140-1, that if the series can be represented by any of
the curves there given, it must be by No. 4, excluding the skew
binomial as unsuitable for reasons already given. It is also
obvious from the run of the figures in Table IX, that the
curve is limited in both directions. Equating the expressions
in Table IX with the above numerical values, we have

$$\gamma = \frac{2(n+3)}{3(n+2)} = .7397; \text{ whence } n = 7.13$$

$$\beta_1 = \frac{4(n+1)(p-q)^2}{(n+2)^2\,pq} = .21283;$$

whence $(p-q)^2 = .5453\,pq$

$(p+q)^2 = 4.5453\,pq = 1$ (since $p + q = 1$)

$(p - q) = \left[\dfrac{.5453}{4.5453}\right]^{\frac{1}{2}} = .3464$

giving $p = .6732$; $q = .3268$

$$\mu_2 = \frac{4pq}{n+1}\,a^2 = 1.1741; \text{ whence } a = 3.293$$

thus giving a range of 32.93 years on either side of the age
for which the value of $x$ in the formula $= 0$. This has nothing
to do with the zero point (age 50) in Table IX. The mean
age as is seen from that table is $50 + 1.385 = 51.385$. The
value of $m_1$, the mean as computed by the above formula, is

$$m_1 = (q - p)\,a = -1.1407$$

that is, 11.407 years earlier than the central point of the range,
giving for the latter, $51.385 + 11.407 = 62.79$, say. The range
of the curve is therefore from age 29.86 to age 95.72; and,
computing the values of $np - 1$ and $nq - 1$, we have, for the
final form of the equation of the curve, when $x =$ the age:

$$y = \kappa\,(x - 29.86)^{1.33}\,(95.72 - x)^{3.80}$$
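The successive steps of this fitting may be set out as a short calculation. The sketch below starts from the quinquennial moments quoted above and assumes the relations $\gamma = (\beta_1+4)/(\beta_2+3) = 2(n+3)/\{3(n+2)\}$, $\beta_1 = 4(n+1)(p-q)^2/\{(n+2)^2pq\}$ and $\mu_2 = 4pq\,a^2/(n+1)$ for the curve in the form of equation (9); it carries full precision throughout, so the results differ in the last figure from the rounded values of the text.

```python
import math

# Central moments for quinquennial grouping, unit of time = 10 years.
mu2, mu3, mu4 = 1.1741, 0.5869, 3.7160
mean_age = 51.385

beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2
gamma = (beta1 + 4) / (beta2 + 3)

# gamma = 2(n + 3) / (3(n + 2)), solved for n.
n = 6 * (1 - gamma) / (3 * gamma - 2)

# beta1 gives the ratio (p - q)^2 / (pq).
ratio = beta1 * (n + 2) ** 2 / (4 * (n + 1))
pq = 1 / (4 + ratio)              # since (p + q)^2 = (p - q)^2 + 4pq = 1
d = math.sqrt(ratio * pq)         # p - q, taken positive because mu3 > 0
p, q = (1 + d) / 2, (1 - d) / 2

a = math.sqrt(mu2 * (n + 1) / (4 * pq))

centre = mean_age - 10 * (q - p) * a      # the mean lies (q - p)a from the centre
lower, upper = centre - 10 * a, centre + 10 * a

print(round(n, 3), round(p, 4), round(q, 4), round(a, 4))
print(round(lower, 2), round(upper, 2))
print(round(n * q - 1, 2), round(n * p - 1, 2))   # indices of the two factors
```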

It is often a convenience, however, to have the values
of the central ordinates of the groups, which may be
approximately obtained by interpolation. If the numbers in
any group are represented by the symbol $u_x$, the number of
years in each group being $t$, the value of the central ordinate
of the group (that is to say, the numbers under observation
exactly at the central age of the group) will be approximately
$\dfrac{1}{t}\left(u_x - \dfrac{\Delta^2 u_{x-t}}{24}\right)$.* As, however, it is convenient to treat the
interval $t$ as the unit, for the time being, we may write as the
values of the central ordinates $u_x - \dfrac{\Delta^2 u_{x-1}}{24}$ (the original
numbers for each group less $\tfrac{1}{24}$th of their respective central
second differences). In the class of curves we are discussing,
namely, those having close contact at both ends with the axis
of $x$, the numerical values of the moments as deduced from
these ordinates will be very nearly the values for the
continuous curve, unless the number of groups is very
small. Thus the values of $\int_k^h y\,dx$, and of the functions
$\int_k^h xy\,dx$, $\int_k^h x^2y\,dx$, will be found by taking the sum of the
ordinates of $y$, computed as above, and the sum of the
products $xy$, $x^2y$.

An advantage attaching to the use of ordinates in lieu of
areas is that, in the class of curves we are dealing with, we
can, by examination of the differences of the logarithms of
the ordinates, gain a better idea of the nature of the curve
than can be obtained from the grouped figures. (See Third
Lecture, p. 50.) It is also easier to compare the graduated
figures as given by the frequency curve by means of isolated
ordinates than by means of groups or areas.


* The formula to 4th differences is $u_x - \dfrac{\Delta^2 u_{x-1}}{24} + \dfrac{\Delta^4 u_{x-2}}{213}$ nearly, and in
order that the resulting 4th moment should agree exactly with that obtained
from the use of the grouped figures, or areas, with Sheppard’s corrections, the
4th difference is required, but for practical purposes it is not often needed.


The use of the central ordinates of the groups has
the incidental advantage, which is very considerable
in the case of a mortality or similar experience, of
giving trustworthy values of the force of mortality, or
corresponding function, for the ages corresponding to the
position of the ordinates. In the usual plan of summarizing
a mortality table by giving the numbers at risk and deaths
in consecutive age groups, the ratio of the deaths to the
numbers at risk in each group is not a useful function, as it
does not correctly represent the mortality for the central age,
except near the middle of the table, where the numbers under
observation in successive years are nearly constant.

We may apply this method to the example already dealt
with on p. 55, viz., the experience of ascending premium
policies. The calculations as set out in the following tabular
form are sufficiently clear:


TABLE X.

Mortality experience of lives assured by ascending Premiums,
1863-1893. Duration 10 years and upwards.

Ages | Central Age | Exposed to Risk | Died | *Estimated Central Ordinate, Exposed to Risk | *Estimated Central Ordinate, Died | Force of Mortality at Central Age
(1) | (2) | (3) | (4) | (5) | (6) | (7)
25-30 | 27.5 | 266 | 2 | 168 | .8 | .0048
30-35 | 32.5 | 2,607.5 | 31 | 2,448 | 29.2 | .0119
35-40 | 37.5 | 8,788 | 102 | 8,860 | 102.0 | .0115
40-45 | 42.5 | 13,232.5 | 173 | 13,389 | 175.2 | .0131
45-50 | 47.5 | 13,910 | 192 | 14,007 | 191.7 | .0137
50-55 | 52.5 | 12,254.5 | 218 | 12,284 | 218.6 | .0178
55-60 | 57.5 | 9,878.5 | 229 | 9,878 | 228.4 | .0232
60-65 | 62.5 | 7,512.5 | 255 | 7,518 | 255.4 | .0340
65-70 | 67.5 | 5,007.5 | 271 | 4,994 | 274.4 | .0549
70-75 | 72.5 | 2,837 | 206 | 2,809 | 205.6 | .0732
75-80 | 77.5 | 1,347.5 | 151 | 1,324 | 151.5 | .1144
80-85 | 82.5 | 413.5 | 85 | 389 | 84.8 | .2180
85-90 | 87.5 | 76.5 | 24 | 66 | 22.3 | .3379
90-95 | 92.5 | 4.5 | 3 | 2 | 2.2 | 1.1000
Totals | | 78,136 | 1,942 | 78,136 | 1,942.0 |


* Taking 5 years as the unit, computing by formula $u_x - \dfrac{\Delta^2 u_{x-1}}{24}$, where
$u_x$ represents the number in columns (3) and (4). By this formula there are
-11 exposed to risk at age 22.5; these have been included in the
figure for age 27.5.
If the values of the moments are computed from column (5)
exactly as was done with the Binomial Curve (Table VIII, p. 53)
they will be found to be practically identical with those found
above. The estimated values of $u_x$ for the central ages of the
group are inserted as they will be used later.
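The estimation of the central ordinates may be sketched thus, taking five years as the unit and applying $u_x - \Delta^2 u_{x-1}/24$ to the quinquennial numbers exposed to risk. The figures are those of Table X as here read (the group 50-55 being taken as 12,254.5, the value consistent with the printed total); groups beyond the ends of the table are assumed to be zero.

```python
# Quinquennial exposed to risk, ages 25-30 up to 90-95.
exposed = [266, 2607.5, 8788, 13232.5, 13910, 12254.5, 9878.5,
           7512.5, 5007.5, 2837, 1347.5, 413.5, 76.5, 4.5]

def central_ordinates(groups):
    """u_x less 1/24th of its central second difference, zero beyond the ends."""
    padded = [0.0] + list(groups) + [0.0]
    return [padded[i] - (padded[i - 1] - 2 * padded[i] + padded[i + 1]) / 24
            for i in range(1, len(padded) - 1)]

ordinates = central_ordinates(exposed)

# The formula also yields small negative ordinates just outside the table
# (central ages 22.5 and 97.5); with these the total is preserved.
outside = -(exposed[0] + exposed[-1]) / 24

for age, value in zip(range(275, 975, 50), ordinates):
    print(age / 10, round(value, 1))
print(round(sum(ordinates) + outside, 6))
```

The ordinates for central ages 32.5 to 82.5 agree with column (5) to the nearest unit; at the extreme ages the printed figures also absorb the small ordinates falling outside the table.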


In many cases the principle of the method of moments 
may be used to fit a curve to a series of observations without 
actually computing the numerical values of the moments 
themselves, using instead the successive summations of the 
ordinates, or areas, from which the moments can be readily 
obtained if required. This method is also useful if one or 
both limits to the range of the curve can be assumed. 

Consider a scheme such as the following, in which, with a 
view to clearness, we use actual numbers of the series, given 
on p. 53, instead of symbols :-— 


In this scheme, each column is formed from the preceding
by successive addition from the bottom, in the same way
that the $M_x$ column is formed from $C_x$, and $R_x$ from $M_x$.

If we take the value against $x = 0$ in the column $\Sigma u_x$, say
$\Sigma u_0$, we see that each value of $u_x$ occurs once only in that
total. In the total appearing against $x = 1$ in the second
summation, say $\Sigma^2 u_1$, each value of $u_x$ occurs $x$ times;
similarly the total against $x = 2$ in the column $\Sigma^3 u_x$, say
$\Sigma^3 u_2$, represents the sum of the products $\dfrac{x(x-1)}{2}\,u_x$, and the
total against $x = 3$ in the column $\Sigma^4 u_x$, say $\Sigma^4 u_3$, represents
the total of the products $\dfrac{x(x-1)(x-2)}{6}\,u_x$, and so on, the
coefficients following the Binomial law. It is evident from
this that the sums of the products $x^2 u_x$, $x^3 u_x$, &c., are
implicitly contained in these totals; and that if these sums
of the graduated and ungraduated values are in agreement,
the moments of the two curves will also agree. Writing $m_n$
as the value of the $n$th moment round the ordinate of $x = 0$,
we shall find:*


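$$m_2 = \frac{2\,\Sigma^3 u_2 + \Sigma^2 u_1}{\Sigma u_0}$$

$$m_3 = \frac{6\,\Sigma^4 u_3 + 6\,\Sigma^3 u_2 + \Sigma^2 u_1}{\Sigma u_0}$$

$$m_4 = \frac{24\,\Sigma^5 u_4 + 36\,\Sigma^4 u_3 + 14\,\Sigma^3 u_2 + \Sigma^2 u_1}{\Sigma u_0}$$

These formulæ may be simplified if we write them in a
form analogous to central difference formulæ, writing, for
example:

$\Sigma^3 u_{1\frac{1}{2}}$ for $\tfrac{1}{2}\left(\Sigma^3 u_1 + \Sigma^3 u_2\right)$,

these average values being shown in antique type in the
Scheme. We then have, omitting the common divisor $\Sigma u_0$:

$$m_2 = 2\,\Sigma^3 u_{1\frac{1}{2}}$$
$$m_3 = 6\,\Sigma^4 u_2 + m_1$$
$$m_4 = 24\,\Sigma^5 u_{2\frac{1}{2}} + m_2$$

The equivalence of the above formulæ may be illustrated by
the following numerical examples based on the above scheme.


* See the demonstration in Note E, p. 124.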
Using $N$ as an abbreviation of $\Sigma u_0$ = the total number of
observations, we have

$N\,m_0 = 729 = \Sigma u_0$
$N\,m_1 = 2916 = \Sigma^2 u_1$
$N\,m_2 = 12636 = 2\Sigma^3 u_2 + \Sigma^2 u_1 = 2 \times 4860 + 2916$
$\qquad = 2\Sigma^3 u_{1\frac{1}{2}} = 2 \times 6318$
$N\,m_3 = 57996 = 6\Sigma^4 u_3 + 6\Sigma^3 u_2 + \Sigma^2 u_1 = 6 \times 4320 + 6 \times 4860 + 2916$
$\qquad = 6\Sigma^4 u_2 + \Sigma^2 u_1 = 6 \times 9180 + 2916$
$N\,m_4 = 278316 = 24\Sigma^5 u_4 + 36\Sigma^4 u_3 + 14\Sigma^3 u_2 + \Sigma^2 u_1 = 24 \times 2160 + 36 \times 4320 + 14 \times 4860 + 2916$
$\qquad = 24\Sigma^5 u_{2\frac{1}{2}} + 2\Sigma^3 u_{1\frac{1}{2}} = 24 \times 11070 + 2 \times 6318$

The last may be compared with the direct calculation of
$x^4 u_x$ given in the last column of the scheme. The values of
the moments through the centroid vertical may be obtained if
required by the formulæ:

$\mu_1 = 0$
$\mu_2 = m_2 - (m_1)^2$
$\mu_3 = m_3 - 3(m_1)\mu_2 - (m_1)^3$
$\mu_4 = m_4 - 4(m_1)\mu_3 - 6(m_1)^2\mu_2 - (m_1)^4$.
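The summation method may be sketched as follows for the point binomial series of Table VIII. Each column is formed from the preceding one by successive addition from the bottom, and the moments about the ordinate of $x = 0$ are then assembled from single entries of those columns; the helper name is illustrative only.

```python
from itertools import accumulate

u = [1, 12, 60, 160, 240, 192, 64]   # terms of 729 (1/3 + 2/3)^6 for x = 0..6

def sum_from_bottom(column):
    """Running totals accumulated from the last entry upwards."""
    return list(accumulate(reversed(column)))[::-1]

S1 = sum_from_bottom(u)     # first summation
S2 = sum_from_bottom(S1)    # second summation
S3 = sum_from_bottom(S2)    # third summation
S4 = sum_from_bottom(S3)    # fourth summation
S5 = sum_from_bottom(S4)    # fifth summation

N = S1[0]
Nm1 = S2[1]
Nm2 = 2 * S3[2] + S2[1]
Nm3 = 6 * S4[3] + 6 * S3[2] + S2[1]
Nm4 = 24 * S5[4] + 36 * S4[3] + 14 * S3[2] + S2[1]

# Central-difference forms, using averages of adjacent entries.
Nm2_central = 2 * (S3[1] + S3[2]) / 2
Nm3_central = 6 * S4[2] + S2[1]
Nm4_central = 24 * (S5[2] + S5[3]) / 2 + Nm2_central

# Transfer to the centroid vertical.
m1, m2, m3, m4 = Nm1 / N, Nm2 / N, Nm3 / N, Nm4 / N
mu2 = m2 - m1 ** 2
mu3 = m3 - 3 * m1 * mu2 - m1 ** 3
mu4 = m4 - 4 * m1 * mu3 - 6 * m1 ** 2 * mu2 - m1 ** 4

print(N, Nm1, Nm2, Nm3, Nm4)
print(round(mu2, 4), round(mu3, 4), round(mu4, 4))
```

The coefficients 2; 6, 6; and 24, 36, 14 are those which express $x^2$, $x^3$ and $x^4$ in terms of the binomial coefficients of $x$, which is why single entries of the summation columns suffice.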


Where the number of terms in the series is few, there is no
special advantage in this method; but if the number of terms
is considerable it effects a saving of time, more particularly
if the calculation of the moments round the centroid vertical
is not needed by the conditions of the problem, as in the case
of the graduation of rates of mortality by Makeham’s or
any similar frequency formula.

The case of curves not making close contact with the axis
of $x$ at both ends requires to be considered separately, but the
results obtained are not altogether satisfactory; see Elderton,
pages 29-30. The difficulty can, however, to a great extent
be avoided in most cases arising in actuarial work by using
very small groups, or even individual values for each year of
age, &c., in calculating the moments. The labour although
increased is by no means prohibitive if the summation method,
above described, be adopted.

Professor Karl Pearson has shown* that the method of 
fitting a curve by computing its moments should lead to 
nearly the same results as the method of least squares. If we 
are fitting to a given set of observations an ordinary parabolic 
curve, represented by the equation $y = a + bx + cx^2 +$ &c., then 
the method of moments and the method of least squares are 
identical.† He infers from this fact that, even if y is 
represented by a more complex expression, the numerical 
results from the method will be nearly the same as with the 
method of least squares. It would appear at first sight that 
the effect of the method of moments is to give equal weight 
to each observation or group of observations, in spite of their 
having unequal average errors; whereas the method of 
least squares should, strictly speaking, be applied only when 
the average error of each observation is nearly equal.‡ In a 
mortality table, where the number of persons under observation 
and the number of deaths are relatively large in the middle 
of the table and fall off to zero at the beginning and end, the 
probability of a given error in the value of $q$ is very much 
smaller at the central ages; while, on the other hand, the 
probability of a deviation of a unit in the number of deaths is 
correspondingly greater. The same applies to most tables of 
statistics, as they usually present a series starting from zero, 
rising to a maximum, and diminishing to zero again, the 
weight of the observations being in the middle of the curve, 
where, however, the probability of a given numerical deviation 
in the actual numbers is also greater. 
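
The identity of the two methods for a parabolic curve can be checked numerically. The following is a minimal sketch, not taken from the Lectures: the observations are made up, and the helper names (`solve_linear`, `least_squares_poly`, `moments`) are purely illustrative. It fits $y = a + bx + cx^2$ by unweighted least squares and confirms that the fitted values reproduce the first three moments of the observations.

```python
def solve_linear(a, b):
    """Solve a small dense system a.x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def least_squares_poly(xs, ys, degree):
    """Unweighted least-squares polynomial obtained from the normal equations."""
    a = [[sum(x ** (i + j) for x in xs) for j in range(degree + 1)]
         for i in range(degree + 1)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(degree + 1)]
    return solve_linear(a, b)


def moments(xs, ys, count):
    """Raw moments sum(y * x**k) for k = 0 .. count - 1."""
    return [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(count)]


xs = [0, 1, 2, 3, 4, 5, 6]
ys = [3.0, 9.0, 20.0, 26.0, 21.0, 11.0, 4.0]   # made-up observations
coef = least_squares_poly(xs, ys, 2)
fitted = [sum(c * x ** k for k, c in enumerate(coef)) for x in xs]
observed_moments = moments(xs, ys, 3)
fitted_moments = moments(xs, fitted, 3)
```

The normal equations of least squares are exactly the statements that the zeroth, first and second moments of the fitted and observed series agree, which is why the two lists coincide up to rounding.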

We have seen that in a series of numbers representing the 
distribution of a group into sub-groups the average error in any 
given case is approximately $.8\sqrt{m(n-m)/n}$, where $n$ is the 
number in the group and $m$ the (graduated) number in the 
sub-group. If, as is generally the case, $n$ is large compared to $m$, 

* Biometrika, vol. i, pp. 266-271. 

† If the moments are assumed to ... are introduced, the method of 
... squares; see examples given by Todhunter, J.I.A. ... 

‡ See Note C, p. 117. 

this expression may be taken as equal to $.8\sqrt{m}$, the average 
error in the ratio $m/n$ being approximately $.8\sqrt{m}/n$. Thus, if 
the number at risk at a given age equals $n$ and the true 
probabilities of death and survivorship are $q$ and $p$, then 
$.8\sqrt{npq}$* (which, as $p$ is nearly unity for the greater number 
of ages, may be roughly taken as $.8\sqrt{\text{number of deaths}}$) 
is an approximate expression for the average deviation from 
the expected number of deaths. The method of moments, 
if employed to represent a given series by a parabolic curve, 
assumes an equal probability of unit error in each term of 
the series. If, therefore, the series is of such a character 
that the extreme values are relatively small, these parts of 
the data will have somewhat less than their due weight in the 
fitting process. If, however, the formula to be fitted does 
not represent a parabolic curve, but a curve analogous to the 
normal curve $ke^{-x^2/c^2}$, say a curve of the form $e^{a+bx+cx^2+\&c.}$, 
then it will be found that, on the assumption that the mean 
error in any value $y$ is equal to $\sqrt{y_x}$ (where $y_x$ represents the 
graduated value of $y$), the method of moments gives the same 
result as the method of least squares when the observations 
are duly weighted (see Note F, p. 129). 
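
The approximation $.8\sqrt{npq}$ for the average deviation in the number of deaths can be compared with the exact binomial value. This is a rough sketch under stated assumptions: the exposed to risk (1,000) and rate of mortality (.02) are hypothetical figures chosen for illustration, and the function name is not from the Lectures.

```python
import math


def binomial_mean_deviation(n, q):
    """Exact mean absolute deviation of the number of deaths among n lives."""
    expected = n * q
    total = 0.0
    for d in range(n + 1):
        prob = math.comb(n, d) * q ** d * (1 - q) ** (n - d)
        total += prob * abs(d - expected)
    return total


n, q = 1000, 0.02                               # hypothetical exposure and rate
exact = binomial_mean_deviation(n, q)
approx = 0.8 * math.sqrt(n * q * (1 - q))       # .8 * sqrt(npq)
rough = 0.8 * math.sqrt(n * q)                  # .8 * sqrt(expected deaths)
```

With p near unity the two approximations differ only slightly from each other, which is the reason the text allows the rougher form.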




We come now to the class of curves representing not 
the actual numbers in statistical tables, but the ratios of the 
corresponding numbers in the double series, such as those of 
tables of “Exposed to Risk” and “Died”; curves, that is, 
representing such functions as rates of mortality, of marriage, 
of lapse, of superannuation, &c. The most interesting and 
important of these is the curve due to Makeham’s development 
of Gompertz’s hypothesis, in which the force of mortality at a 
given age $x$ is represented by the expression 

$$\mu_x = A + Bc^x$$

leading to the equation 

$$\log_{10} l_x = K + A'x + B'c^x.$$


This curve has a double value as, apart from its use in 
graduating a mortality table, it has the valuable property 
that the values of annuities on $n$ joint lives of various ages 
can be found from a table of single entry showing the values 
of annuities on $n$ lives of equal age. Owing to its importance 
it will be useful to give some attention to the problem of 
fitting this curve to a mortality experience. We will first 
consider the case of an aggregate or non-select table, that is, 
a table in which the rate of mortality is a function of the age 
alone. 

* See Note A, p. 110; J.I.A., xxvii, 214. 

Various methods have been employed to obtain the values 
of the constants A, B, c, corresponding to a given experience. 
That used by Makeham, and subsequently in a modified form 
by Woolhouse, is based on selected values of $\log l_x$ taken from 
a table already graduated by a finite difference formula. 
Four values of $\log l_x$ may be taken, covering practically the 
whole of adult life, say the values at ages 20, 40, 60, and 80, 
or 25, 45, 65, 85. Either set is sufficient to determine 
the four constants, K, A’, B’ and c, as above. In Woolhouse’s 
graduation of the $H^{M}$ Table, both of these sets of ages were 
employed, the most advantageous values of the constants 
being found by comparing the deviations between the 
graduated and ungraduated values of $l_x$ at quinquennial 
ages according to the two preliminary graduations. If a 
single set of four values of $l_x$ is taken as the basis of the 
graduation, the effect is the same as employing the sums of 
the forces of mortality ($\mu_{x+\frac12}$) between the selected ages, 
giving equal weight to the values at each age. 

The method employed by Mr. King in the Institute of 
Actuaries’ Text-Book, Part II., substitutes for graduated 
values of $\log l_x$ at isolated ages the sum of certain 
groups of the ungraduated values of $\log l_x$. The effect 
of this method would appear to be to give a diminishing 
weight to the values of $\mu_x$ for the ages at the commencement 
and end of the table, which is so far in accordance with 
theory, and to eliminate the effect of errors in isolated 
values of $l_x$. In Biometrika (vol. i, pp. 298-303) Prof. Pearson 
has dealt with the same problem, basing the values of the 
constants upon the successive summations of $\log l_x$. 
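
The determination of K, A’, B’ and c from four equidistant values of $\log l_x$, as in the method attributed above to Makeham and Woolhouse, can be sketched as follows. The algebra (ratio of second differences gives $c^h$) is a standard consequence of the form $\log_{10} l_x = K + A'x + B'c^x$; the numerical constants used to manufacture the four values are assumptions chosen only to make the example self-checking.

```python
def makeham_from_four_values(x0, h, u):
    """Recover (K, A1, B1, c) from log l at ages x0, x0+h, x0+2h, x0+3h.

    A1 and B1 stand for the constants written A' and B' in the text.
    """
    d = [u[i + 1] - u[i] for i in range(3)]     # first differences
    e = [d[i + 1] - d[i] for i in range(2)]     # second differences
    c_h = e[1] / e[0]                           # equals c**h
    c = c_h ** (1.0 / h)
    b1 = e[0] / (c ** x0 * (c_h - 1) ** 2)
    a1 = (d[0] - b1 * c ** x0 * (c_h - 1)) / h
    k = u[0] - a1 * x0 - b1 * c ** x0
    return k, a1, b1, c


# Illustrative (assumed) constants, used only to generate four consistent values.
K0, A0, B0, C0 = 5.0, -0.003, -0.0004, 10 ** 0.04
ages = [20, 40, 60, 80]
values = [K0 + A0 * x + B0 * C0 ** x for x in ages]
k, a1, b1, c = makeham_from_four_values(20, 20, values)
```

Because only four values enter, any error in one of them passes directly into the constants, which is the weakness the grouping and summation methods are meant to reduce.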

It is, perhaps, preferable to deal directly with the actual 
exposures and deaths in a manner similar to that first 
described by Makeham (J.I.A., vol. xvi, p. 344). This can 
be readily done, and the same method of summations or 
moments applied as in the case of any other frequency curve. 
Tabulate $E_{x+\frac12}$, that is, the number exposed to risk in the 
middle of the year of age $x$, and $\theta_x$, representing the deaths 
occurring between ages $x$ and $x+1$. Assuming, as we may with 
sufficient accuracy for ordinary purposes,* that the force of 
mortality at age $x+\frac12$, or the function $\operatorname{colog}_e p_x$, is equal 
to $m_x$, the “central death rate” $= \theta_x / E_{x+\frac12}$, we have 

$$E_{x+\frac12}\,(A + Bc^{x+\frac12}) = \theta_x.$$

If we knew the value of $c$, we could then tabulate the values 
of $E_{x+\frac12}$, $E_{x+\frac12}c^{x+\frac12}$, $\theta_x$ respectively, and summing these values 
continuously to the end of the table, and again taking the 
total of these sums, we should obtain equations in this 
form: 

$$(\Sigma E_{x+\frac12})A + (\Sigma E_{x+\frac12}c^{x+\frac12})B = \Sigma\theta_x$$
$$(\Sigma^2 E_{x+\frac12})A + (\Sigma^2 E_{x+\frac12}c^{x+\frac12})B = \Sigma^2\theta_x$$

a simple simultaneous equation for determining A and B. 

* Assuming the usual table of $E_x$ and $\theta_x$ to represent accurately the facts 
and to be undisturbed at the older ages (where alone the point is of any 
importance) by entrances or by exits other than by death, then $q_x = \theta_x / E_x$ 
accurately; and $\operatorname{colog}_e p_x = m_x$ very nearly, where $m_x$ is the 
“central death rate” $\theta_x / (E_x - \frac12\theta_x)$; more closely, 
$\operatorname{colog}_e p_x = \theta_x / (E_x - \frac12\theta_x - \frac{1}{12}\theta_x^2 / E_x)$. 
The error caused by omitting the small term in the denominator and taking 
$\operatorname{colog}_e p_x = m_x$ is only material at the older ages, amounting 
to 1 per-cent in the rate of mortality when $q_x = .3$, or about age 90. 

As a matter of fact, the value of $\log_{10} c$ does not usually 
differ very much from .04, and in general it will be found 
that a small change in the value of $\log c$ does not involve a 
serious change in the general character of the table. In an 
important series of observations, however, we cannot assume 
the value of $c$. Either we must determine $c$ by a method 
such as that used by Mr. Woolhouse or Mr. King, which will 
give a sufficient approximate value, or we may adopt two or 
more alternative values of $c$, which appear likely to contain 
between them the true value. Having obtained the values 
of constants A and B for each given value of $c$, set out 
the expected or graduated deaths, and compare them with 
the actual numbers in suitable age groups. If the 
third summation of the differences of the graduated and 
ungraduated deaths is computed, it will be possible by 
interpolation to obtain a value of $\log c$ making these sums 
equal to zero. Putting the matter into the language of 
moments, we shall then have made the first, second and 
third moments of the graduated and ungraduated curves 
equal, and in that way we shall have selected what may be 
considered the best values of the constants A, B and c.* 
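
The procedure just described (solve for A and B at a trial value of c from the first and second summations, then interpolate on $\log c$ so that the third summation of the deviations vanishes) can be sketched as below. This is an illustration under loud assumptions, not the author's worked example: the exposures are an invented bell-shaped column, the deaths are generated exactly from assumed Makeham constants with no accidental fluctuation, and all function names are hypothetical.

```python
import math


def summations(values, order):
    """Total of the order-th successive summation of a column."""
    col = list(values)
    for _ in range(order - 1):
        running, acc = [], 0.0
        for v in col:
            acc += v
            running.append(acc)
        col = running
    return sum(col)


def fit_a_b(ages, exposed, deaths, c):
    """Solve the two summation equations for A and B at a trial value of c."""
    weighted = [e * c ** (x + 0.5) for x, e in zip(ages, exposed)]
    s1e, s1w, s1d = (summations(v, 1) for v in (exposed, weighted, deaths))
    s2e, s2w, s2d = (summations(v, 2) for v in (exposed, weighted, deaths))
    det = s1e * s2w - s1w * s2e
    return (s1d * s2w - s1w * s2d) / det, (s1e * s2d - s1d * s2e) / det


def third_summation(ages, exposed, deaths, c):
    """Third summation of (graduated minus actual) deaths at a trial c."""
    a, b = fit_a_b(ages, exposed, deaths, c)
    dev = [e * (a + b * c ** (x + 0.5)) - d
           for x, e, d in zip(ages, exposed, deaths)]
    return summations(dev, 3)


# Synthetic experience (assumed): exact Makeham deaths, no random error.
TRUE_A, TRUE_B, TRUE_LOG_C = 0.006, 0.00004, 0.04
ages = list(range(20, 90))
exposed = [1000.0 * math.exp(-((x - 45) / 18.0) ** 2) for x in ages]
deaths = [e * (TRUE_A + TRUE_B * 10 ** (TRUE_LOG_C * (x + 0.5)))
          for x, e in zip(ages, exposed)]

# Two trial values of log10(c), then linear interpolation to a zero third sum.
lo, hi = 0.039, 0.041
f_lo = third_summation(ages, exposed, deaths, 10 ** lo)
f_hi = third_summation(ages, exposed, deaths, 10 ** hi)
log_c = lo - f_lo * (hi - lo) / (f_hi - f_lo)
a_hat, b_hat = fit_a_b(ages, exposed, deaths, 10 ** log_c)
```

With real data the deaths carry accidental errors, so the interpolated value of $\log c$ is only one candidate; the text goes on to prefer a comparison of the graduated and actual deaths over a mechanical zeroing of the third summation.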

It may be objected that the use of this particular method 
is open to the same implication of giving equal weight to all 
the observations, as in the case of the values of $l_x$. We can 
avoid that objection by duly weighting the observations at 
each age by multiplying the “exposed” and “died” at 
each age by the approximately graduated values of $(\theta_x)^{-\frac12}$. 
But although this would give suitable weights to the 
observations, if the curve of mortality were a parabolic 
curve, or if it were known to follow accurately Makeham’s 
Law, it is not quite clear that it would do so in practice. 
It may be assumed that (when the constants are formed 
by reproducing the moments of the deaths) in not 
weighting the observations, we give less weight to those at 
the commencement and at the end of the table than they are 
theoretically entitled to. But this is not a serious practical 
objection. Makeham’s law is only approximately correct, 
and as we reach younger adult ages it begins to diverge from 
the facts of observation; on the other hand, as we reach the 
older ages the actual importance of the observations is less 
than the weight to which they are theoretically entitled, as 
estimated by the number of deaths, owing to the fact that 
the actual mortality at those ages does not materially affect 
financial questions such as rates of premium and reserves. 

Beyond this consideration there is also a degree of doubt 
attaching to the rates of mortality at extreme ages in any 
table.† Indeed, we may go further, and say that in all 
considerable tables of statistics the numbers at the extremes 
of the table are proportionately more affected by sporadic or 
accidental errors of observation than those in the body of the 
table. If we suppose that in a very small percentage of 
cases the ages of the “Exposed to Risk” and “Died” are 
affected by errors of calculation, clerical errors in transcribing 
the data, &c. (these cases being removed from their 
true position and scattered at random over the table), the 
effect upon the data over the great bulk of the table will be 
insignificant owing to the large numbers under observation 
and to a balance of errors, but the effect upon the experience 
at the extremes of the table, where the actual numbers under 
observation are very small, may well be appreciable. 

* See Note G, p. 131. 

† See my notes on this subject in “Principles and Methods”, p. 148. 

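
The argument about misplaced cases can be made concrete with a small expected-value calculation. The sketch below assumes an invented exposure column and an assumed error rate of one case in a thousand, with the stray cases spread uniformly over the ages; none of these figures comes from the Lectures.

```python
def stray_share(exposed, error_rate):
    """Expected share of each age's recorded cases that are misplaced strays."""
    total = sum(exposed)
    strays_per_age = error_rate * total / len(exposed)   # uniform scatter
    shares = []
    for e in exposed:
        genuine = e * (1 - error_rate)
        shares.append(strays_per_age / (genuine + strays_per_age))
    return shares


# Hypothetical exposures: small at the extremes, large in the body of the table.
exposed = [20, 500, 5000, 20000, 30000, 20000, 5000, 500, 20]
shares = stray_share(exposed, 0.001)
```

Under these assumptions roughly a third of the cases recorded at the two extreme ages would be strays, against a few hundredths of one per cent at the central age.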
Reverting to the problem of obtaining the value of $c$ in 
Makeham’s formula directly from the observations, we may 
endeavour to represent the curve of the “Exposed to Risk” 
by some frequency curve which can be suitably combined 
with the formula for $\mu_x$ to represent the deaths; such, for 
example, as the normal curve $y = ke^{-x^2}$, or the curve 
No. 5, y=ke"er, or by the terms of a binomial expansion 
(see Calderon, J.I.A., vol. xxxv, p. 157). Unfortunately none 
of these curves gives a very satisfactory representation 
of the average form of the “Exposed to Risk” curve. 
In the case of the binomial, in order to get a tolerable 
fit, it will be generally found that the value of $n$ in the 
expression $\binom{n}{x}p^x q^{n-x}$ (representing the general term of the 
binomial) must be taken small; that is to say, the data must 
be arranged in somewhat large groups of not less than about 
10 ages to a group. In either case it will be necessary, after 
obtaining a frequency curve fitting the numbers of the 
“Exposed to Risk,” to re-compute the deaths on the basis 
of these graduated numbers. 

Thus, while it is possible to determine the value of $c$ 
directly from the observations, the process is laborious. In 
my opinion, it is preferable to use certain trial values of $c$ 
which we know to lie near the truth, and, by a comparison of 
the resulting graduated deaths with the original facts, to 
select a value which appears to give the best general 
agreement, which may not always be that making the third 
summation of the deviations zero.* 

There is a further point to be considered with respect to 
the nature of the differences between the original numbers, 
whether of deaths or of other observations, and the numbers 
obtained by a graduation following a formula such as that of 
Makeham. These divergences between the ungraduated 
and graduated numbers will in part arise from the smallness 
of the numbers under observation, and may in part arise 
from the fact that the formula does not accurately represent 
the true curve of mortality. For the majority of mortality 
tables, for male lives at the adult ages, Makeham’s formula 
is so near the truth that we may in practice neglect the 
systematic errors and assume that the formula represents 
the true curve of mortality, determining our constants as 
though the whole of the deviations in the graduated and 
ungraduated curves are accidental and due to the smallness 
of the data; but for some tables, notably those representing 
the mortality of females, this will not be the case. 

* See Note G, p. 133. 

Other expressions may be given representing approximately 
the curve of $\mu_x$, as, for instance, 

$$\mu_x = ma^x + nb^x \qquad (1)$$

whence 

$$\log_{10} l_x = K + Ma^x + Nb^x \qquad (2)$$

an expression which enables us to represent some mortality 
tables, such as those arising from tropical experience, that 
are not very readily represented by Makeham’s formula. 
The values of these constants can be readily obtained either 
from 5 selected values of $\log l_x$ or from the sums of the values 
of selected groups of the same function. 

The above formula for $l_x$ preserves in a modified form the 
principle of uniform seniority. Not, however, in a very 
practicable shape, as in order to compute values of joint-lives 
(any number) we require tables of $k$ joint-lives of equal age 
for various values of $k$. It is of course evident from general 
considerations that the force of mortality on any number of 
joint-lives must consist of two terms, each of which is 
a member of a geometrical progression, and that if we can 
find an age $w$ where the relative values of these two terms are 
the same as in the joint-life status, the actual values will 
be the same when multiplied by some suitable constant $k$. 
The required joint-life annuity will then be represented by 
the annuity on $k$ joint-lives all of age $w$. 

Take as an example an annuity on the joint-lives of $(x)$ 
and $(y)$. Find $k$ and $w$ so that 

$$a^x + a^y = ka^w, \qquad b^x + b^y = kb^w,$$

whence 

$$w = \frac{\log(a^x + a^y) - \log(b^x + b^y)}{\log a - \log b} \quad\text{and}\quad k = (a^x + a^y) \div a^w = (b^x + b^y) \div b^w.$$

Then it is obvious that if we replace $x$ and $y$ by $x+t$ and $y+t$, 
$k$ will remain unaltered and $w$ will become $w+t$, so that the 
principle of uniform seniority is maintained. Thus, an 
annuity on $x$ and $y$ will be equal to an annuity on $k$ lives all 
aged $w$; or, since $k$ will not generally be integral, it will be 
more convenient to say that $a_{xy} = a'_w$, where $a'$ is calculated at 
forces of mortality always $k$ times the normal force, age for 
age. Thus, we shall require tables for various standard 
values of $k$, and we shall usually require a double interpolation, 
since neither $w$ nor $k$ will usually be integral. 
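
The equations for $k$ and $w$ can be checked numerically. In this sketch the bases $a$ and $b$ and the ages are illustrative assumptions only, and the function name is hypothetical; the code simply evaluates the two expressions above and confirms that shifting both ages by $t$ leaves $k$ unchanged and moves $w$ to $w+t$.

```python
import math


def equivalent_equal_ages(x, y, a, b):
    """Return (k, w) with a**x + a**y = k*a**w and b**x + b**y = k*b**w."""
    sa = a ** x + a ** y
    sb = b ** x + b ** y
    w = (math.log(sa) - math.log(sb)) / (math.log(a) - math.log(b))
    k = sa / a ** w
    return k, w


a, b = 1.1007, 1.0251          # illustrative bases (assumed for this example)
k, w = equivalent_equal_ages(30, 50, a, b)
k_later, w_later = equivalent_equal_ages(30 + 7, 50 + 7, a, b)
```

When the two ages are equal the formulas give $k = 2$ and $w = x$ exactly; otherwise $k$ is in general fractional, which is why tables for several standard values of $k$ and a double interpolation are needed.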

The principle of employing the sum of two (or more) 
geometrical series to represent the logarithm of a function 
such as the number living may also be used with advantage, 
as will be seen later on, for census tables. (See the Sixth 
Lecture.) 

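As an example of this formula, we may apply it to the 
column of $\log l_x$ in the $O^{M}$ Table. 

Taking the values of $\log l_x$ for ages 20, 37, 54, 71 and 88, 
we have the following data: 

$$\log l_{20} = 4.98432 = K + Ma^{20} + Nb^{20}$$
$$\log l_{37} = 4.94279 = K + Ma^{37} + Nb^{37}$$
$$\log l_{54} = 4.85300 = K + Ma^{54} + Nb^{54}$$
$$\log l_{71} = 4.58086 = K + Ma^{71} + Nb^{71}$$
$$\log l_{88} = 3.47509 = K + Ma^{88} + Nb^{88}$$

whence, differencing, and writing 

$$Ma^{20}(a^{17} - 1) = -M'; \quad Nb^{20}(b^{17} - 1) = -N'; \quad a^{17} = \alpha; \quad b^{17} = \beta;$$

we have 

$$M' + N' = \log l_{20} - \log l_{37} = .04153 = A$$
$$M'\alpha + N'\beta = \log l_{37} - \log l_{54} = .08979 = B$$
$$M'\alpha^2 + N'\beta^2 = \log l_{54} - \log l_{71} = .27214 = C$$
$$M'\alpha^3 + N'\beta^3 = \log l_{71} - \log l_{88} = 1.10577 = D$$

whence, noting that 

$$\alpha\beta = \frac{BD - C^2}{AC - B^2} \quad\text{and}\quad \alpha + \beta = \frac{AD - BC}{AC - B^2},$$

we easily obtain: 

$$\alpha = 5.1082; \quad a = 1.1007$$
$$\beta = 1.5243; \quad b = 1.0251$$
$$M' = .0073886; \quad M = -.00026403$$
$$N' = .0341414; \quad N = -.039657$$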
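
The arithmetic of this example can be retraced as follows. The sketch uses only the five values of $\log l_x$ and the relations given above; the variable names (`M1`, `N1` for $M'$, $N'$) and the derived constant $K$, which the text does not quote, are introduced here for illustration.

```python
import math

ages = [20, 37, 54, 71, 88]
log_l = [4.98432, 4.94279, 4.85300, 4.58086, 3.47509]
h = ages[1] - ages[0]                      # common interval of 17 years

# Successive differences A, B, C, D as defined in the text.
A, B, C, D = (log_l[i] - log_l[i + 1] for i in range(4))

# alpha and beta are the roots of z**2 - (alpha + beta) z + alpha * beta = 0.
denom = A * C - B * B
total = (A * D - B * C) / denom            # alpha + beta
product = (B * D - C * C) / denom          # alpha * beta
root = math.sqrt(total * total - 4 * product)
alpha, beta = (total + root) / 2, (total - root) / 2

M1 = (B - beta * A) / (alpha - beta)       # M' in the text
N1 = A - M1                                # N' in the text
a, b = alpha ** (1 / h), beta ** (1 / h)
M = -M1 / (a ** ages[0] * (alpha - 1))
N = -N1 / (b ** ages[0] * (beta - 1))
K = log_l[0] - M * a ** ages[0] - N * b ** ages[0]   # not quoted in the text


def log_l_formula(x):
    """Formula (2): log10 l_x = K + M a**x + N b**x."""
    return K + M * a ** x + N * b ** x
```

Since five constants are fitted to five values, `log_l_formula` reproduces the five data points to within rounding; the quality of the graduation has to be judged at the intermediate ages.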


The following comparison of the values of $l_x$ and 
decrements for quinquennial ages will indicate the approximation 
of the formula to the $O^{M}$ Table. 


TABLE XI. 

Values of $l_x$ and of $(l_x - l_{x+5})$ according to the $O^{M}$ Table, as 
compared with re-graduation by formula (2). 

             l_x              QUINQUENNIAL DECREMENTS      ERRORS
 Age    By         Original     By         Original        +      -
        Formula    Value        Formula    Value

  20    96,453     96,453        2,129      2,066          63     ..
  25    94,324     94,387        2,467      2,445          22     ..
  30    91,857     91,942        2,896      2,947          ..     51
  35    88,961     88,995        3,443      3,528          ..     85
  40    85,518     85,467        4,158      4,205          ..     47
  45    81,360     81,262        5,108      5,077          31     ..
  50    76,252     76,185        6,350      6,266          84     ..
  55    69,902     69,919        7,927      7,846          81     ..
  60    61,975     62,073        9,775      9,766           9     ..
  65    52,200     52,307       11,606     11,692          ..     86
  70    40,594     40,615       12,753     12,863          ..    110
  75    27,841     27,752       12,192     12,222          ..     30
  80    15,649     15,530        9,244      9,171          73     ..
  85     6,405      6,359        4,827      4,763          64     ..
  90     1,578      1,596        1,406      1,410          ..      4
  95       172        186          167        179          ..     12
 100         5          7            5          7          ..      2

FIFTH LECTURE. 


ALTHOUGH in the preceding Lecture the application of 
Makeham’s formula has been considered at some length, its 
importance is such that we may now touch on some further 
points, and particularly on the application of the formula to 
the graduation of select tables. 

The suitability of Makeham’s formula to the graduation 
of mortality tables must be judged as we should judge the 
applicability of any other frequency curve to a given series of 
observations. That is to say, we must consider whether the 
observed differences between the graduated and ungraduated 
values (the computed and actual deaths) fall within what 
may be properly considered to be the limits of error. 
For practical purposes, owing to the great convenience 
attaching to the use of the formula, it is worth while to 
stretch a point in its favour. Instead, therefore, of merely 
considering the closeness of the agreement between the 
actual and computed deaths, we may consider how nearly the 
ungraduated and graduated monetary functions, such as the 
values of premiums or annuities, are in agreement. If this 
agreement is sufficient for our purpose, we are justified 
in adopting the graduation as given by the formula, 
notwithstanding the fact that at certain groups of ages the 
divergences between the graduated and ungraduated deaths 
may be greater than would be expected from the theory of 
probabilities. In this connection it is to be noted that our 
observations relate to past time, and that the quantities we 
are measuring are all liable to change with time. Hence in a 
graduation intended to form the basis of tables of annuities or 
premiums, it is sufficient if the general character of the 
experience is retained without insisting too strongly upon a 
strict adherence to minor features. This is illustrated by the 
following table from “Principles and Methods” (p. 162), in 
which we may anticipate for the moment the question of the 
application of Makeham’s formula to select tables: 


$O^{[M]}$ Whole-Life Participating, Males. 
3 per-cent Premiums for 100 Assured. 

          $P_{[x]}$, $O^{[M]}$             Differences     Sprague's     $H^{[M]} - O^{[M]}$
  Age   Ungraduated  Graduated      +       -      $H^{[M]}$ select   (6) - (3)
  (1)       (2)         (3)        (4)     (5)         (6)           (7)

   20      1.379       1.365        ..     .014       1.563         .198
   25      1.535       1.551       .016     ..        1.703         .152
   30      1.779       1.785       .006     ..        1.925         .140
   35      2.086       2.081        ..     .005       2.218         .137
   40      2.453       2.457       .004     ..        2.602         .145
   45      2.952       2.940        ..     .012       3.106         .166
   50      3.571       3.564        ..     .007       3.755         .191
   55      4.338       4.377       .039     ..        4.635         .258
   60      5.413       5.446       .033     ..        5.827         .381
   65      6.872       6.854        ..     .018       7.433         .579

 Average   3.238       3.242       .004     ..        3.477         .235

Here columns (4) and (5) show how far the graduated select 
annual premium $P_{[x]}$, for each age at entry, differs from the 
ungraduated value for the same age, while column (7) shows 
how far the annual premiums deduced by Dr. Sprague from 
the $H^{M}$ data (Journal of the Institute of Actuaries, vol. xxii, 
p. 391) differ from the premiums deduced from the $O^{[M]}$ 
Experience. The average difference between the graduated 
and ungraduated premiums (irrespective of sign) amounts to 
.015 per £100 assured, a quite insignificant amount; whereas 
the differences between the premiums representing the earlier 
experience and those of the $O^{M}$ Table, representing the 
experience of 30 years later, are all positive and average .235 
per £100 assured. 
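
The two averages quoted can be recomputed from the premium columns of the table. The sketch below is only a check of the arithmetic: the lists transcribe columns (2), (3) and (6) as read from the table, and the variable names are illustrative.

```python
# Columns (2), (3) and (6) of the premium table, ages 20 to 65 at entry.
ungraduated = [1.379, 1.535, 1.779, 2.086, 2.453, 2.952, 3.571, 4.338, 5.413, 6.872]
graduated = [1.365, 1.551, 1.785, 2.081, 2.457, 2.940, 3.564, 4.377, 5.446, 6.854]
sprague = [1.563, 1.703, 1.925, 2.218, 2.602, 3.106, 3.755, 4.635, 5.827, 7.433]

count = len(graduated)
# Average effect of the graduation, irrespective of sign (columns 4 and 5).
graduation_effect = sum(abs(g - u) for g, u in zip(graduated, ungraduated)) / count
# Average excess of the earlier experience over the later (column 7).
secular_change = sum(s - g for s, g in zip(sprague, graduated)) / count
ratio = graduation_effect / secular_change
```

On these figures the change introduced by the graduation is of the order of one-fifteenth of the change between the two experiences.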

Only a part of the differences shown in columns (4) and (5) 
is due to any systematic difference between the mortality 
as shown in the $O^{M}$ data and that assumed by the formula. 
Assuming, however, that the entire differences were due to 
this cause, it will be seen that the changes introduced into 
the values of the monetary functions by using Makeham’s 
formula are a very small percentage of the actual change 
that has occurred in the value of these functions during the 
course of 30 years. 

Although, therefore, the differences between the graduated 
and ungraduated deaths do at certain points somewhat 
exceed the limits of the errors of observation, we are justified 
in using the graduated table as a standard for the future. 

Each case must, of course, be decided upon its own merits, 
and while the $H^{M}$ Experience and the $O^{M}$ Experience have, 
with other tables, proved to be amenable to Makeham’s 
formula, the latter cannot be treated as a “law of mortality”, 
to which all tables may be expected to conform. As already 
stated, its suitability must be tested, as that of any other 
frequency curve, but with rather more latitude owing to its 
practical advantages. In particular the formula is not 
generally suitable for tables representing the mortality of 
Female Lives. 


In the last lecture we considered various methods of 
determining the constants of Makeham’s formula for $\mu_x$ best 
representing a given mortality experience, in particular 
that depending upon the agreement between the totals of the 
graduated and ungraduated deaths and of their successive 
summations. We have so far, however, considered the force 
of mortality as a function of the age only, so that our results 
are applicable only to “mixed” tables of mortality, not to 
“select” tables in which the mortality is treated as a function 
both of the age of the life and of the duration of the 
assurance. 

The formula owes its value, beyond the incidental 
advantage that it gives us a very simple and effective 
method of graduation, to the relation it establishes between 
the value of an annuity upon joint lives of any age and that 
of an annuity upon the same number of joint lives of equal 
age. From the formula for the force of mortality according 
to Makeham’s hypothesis 

$$\mu_x = A + Bc^x$$

it follows that the force of mortality for any number of joint 
lives, aged, for example, at entry $x$, $y$, $z$, is given by the 
formula 

$$\mu_{x+t} + \mu_{y+t} + \mu_{z+t} = 3A + Bc^t(c^x + c^y + c^z) = 3\mu_{w+t},$$

where $c^w = \frac13(c^x + c^y + c^z)$ 
and $t$ represents the period elapsed since the date of entry. 
As a value of $w$ satisfying this equation can always be found, 
and is independent of $t$, it follows that 

$$a_{xyz} = a_{www}.$$
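
The relation between the joint-life force of mortality and that at the equivalent equal age can be verified directly. The constants A, B and c below are assumptions chosen for illustration (with $\log_{10} c = .04$), and the function names are hypothetical.

```python
import math


def makeham_mu(age, A, B, c):
    """Force of mortality A + B * c**age under Makeham's hypothesis."""
    return A + B * c ** age


def equivalent_equal_age(ages, c):
    """Age w with c**w equal to the mean of c**x over the given ages."""
    mean_power = sum(c ** x for x in ages) / len(ages)
    return math.log(mean_power) / math.log(c)


A, B, c = 0.006, 0.00004, 10 ** 0.04       # illustrative constants (assumed)
entry_ages = [30, 42, 55]
w = equivalent_equal_age(entry_ages, c)

# The joint force equals three times the force at age w + t for every t.
gaps = [sum(makeham_mu(x + t, A, B, c) for x in entry_ages)
        - 3 * makeham_mu(w + t, A, B, c) for t in range(0, 40)]
```

Note that `equivalent_equal_age` uses $c$ alone, which is the property the text relies on when A and B are afterwards allowed to vary with the duration since selection.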


It is seen that the relation subsisting between the value 
of $w$ and the values of $x$, $y$, $z$ involves the constant $c$ only, 
and not the constants A and B; hence, any variation 
introduced into the values of the constants A and B, having 
reference to the time elapsed since selection and depending 
only on $t$, will not affect the relation between the age $w$ 
and the ages $x$, $y$, and $z$. We can, therefore, write the 
force of mortality at age $x+t$ for a life select at age $x$ as 
follows: 

$$\mu_{[x]+t} = A + f(t) + [B + \phi(t)]c^{x+t} \qquad (1)$$

and still retain the relation 

$$\mu_{[x]+t} + \mu_{[y]+t} + \mu_{[z]+t} = 3\mu_{[w]+t}$$

when $c^w = \frac13(c^x + c^y + c^z)$. 


Equation (1) may obviously be written in the form 

$$\mu_{[x]+t} = A_t + B_t c^{x+t} \qquad (2)$$

or alternatively, if, as is often more convenient, we work with 
the values of colog $p_x$, in the form 

$$\operatorname{colog}_{10} p_{[x]+t} = \alpha_t + \beta_t c^{x+t} \qquad (3)$$

where $A_t$ and $B_t$, or $\alpha_t$ and $\beta_t$, may be any functions of $t$, but 
are not functions of $x$. We can thus represent the rate of 
mortality as a function both of the age and of the time elapsed 
since selection and so approximate fairly to the rates of 
mortality shown in an “analyzed” or “select” mortality 
experience, while retaining most of the advantages arising 
from the use of Makeham’s formula. The two functions of $t$ 
have probably a tendency to become constant as $t$ increases 
but do not necessarily become so within any special period 
from the date of entry; they may continue to change slowly 
throughout the whole duration of the table, and in theory, no 
doubt, should do so, but for practical purposes it is convenient 
to make them constant after a few years (say 5, or at most 10) 
from the date of entry, beyond which point it is assumed that 
the effect of “selection” has worn off. 

If we set out separately the data for each year of assurance, 
that is, for each value of $t$ so far as we intend to trace selection, 
we shall have a series of equations (corresponding to those 
shown on p. 65 for an aggregate table) for determining the 
numerical values of the functions $f(0)$, $f(1)$, &c., $\phi(0)$, $\phi(1)$, 
&c., the value of $c$ being necessarily that determined for 
the “ultimate” table. In other words, the data for each 
year of duration are treated as representing a mortality 
table complete in itself. We obtain in this way values for $A_t$ 
and $B_t$, or for $\alpha_t$ and $\beta_t$, for each value of $t$, so far as it is 
proposed to carry the select tables. Unless, however, the 
experience is a very large one, these values will be very 
irregular. Indeed, in the case of the $O^{M}$ data, which represent 
a large experience, we have somewhat irregular values 
for $\alpha_t$ and $\beta_t$, even during the first ten years of assurance, 
where the facts are most numerous. The approximate values 
of $\alpha_t$ and $\beta_t$ for the $O^{M}$ data are given on p. 157 of “Principles 
and Methods.” If these values are plotted out, the resulting 
curves exhibit certain obvious characteristics, as will be 
seen by the diagrams opposite, where the irregular lines show 
the ungraduated, and the curved lines the graduated, values of $\alpha_t$ 
and $\beta_t$, and the horizontal lines after 10 years represent the 
values for the experience of 10 years’ duration and upwards, 
when they are assumed to be constant. A period of 10 years 
would appear from the data to be the shortest within which 
we can effect anything like a smooth junction between the 
“select” and “ultimate” mortality rates. 

The values of $\alpha_t$ rise very rapidly in the first few years 
of assurance, but after about 6 or 7 years they appear to 
approach nearly their final value. In the case of $\beta_t$, however, 
we see that if the graduated curve were drawn as closely as 
is consistent with smoothness through the ungraduated values, 
it would probably not reach the level of the ultimate value 
.0000466 until after 15 years from entry, and even then it 
would be below the value of $\beta_t$ for durations of 15 years and 
over. Hence it would seem that the value of $\beta_t$ does not 
become constant until about 20 years have elapsed from the 
date of entry. We may almost say that while the effect of 
selection as reflected in the values of the $\alpha$ constant disappears 
after about 7 years, the effect upon the values of $\beta$ probably 
continues throughout the whole of life. The explanation is, 
no doubt, that the $\alpha$ or A constant represents mortality from 
accidental causes and from non-constitutional diseases of short 
duration, whereas the $\beta$ or B constant represents mortality 
due to diseases of longer duration and to constitutional 
defects. 

Having obtained numerical values of $\alpha_t$ and $\beta_t$ for 
successive values of $t$, it remains to represent these values 
by convenient formulæ. The fact that the function $\beta_t$ does 
not reach its ultimate value at the end of 10 years from 
entry involves either some sacrifice of the agreement between 
the adjusted and unadjusted values of this function, or a 
continuation of the analyzed mortality rates beyond the period 
of 10 years, which is not very convenient. In consequence of 
this fact we cannot apply the method of moments in fitting 
a graduated curve to these values. Where the fitting of a 
frequency curve involves any systematic departure from the 
original facts, the method of moments often gives 
unsatisfactory results, and a curve may be produced 
departing more widely from the observations than if derived 
by a tentative method. 

In selecting formulæ for graduating the rough values 
of $\alpha_t$ and $\beta_t$, there are certain conditions which should be 
fulfilled: 


1. A smooth junction between the curves representing the 
select and ultimate tables. 

2. An agreement between the graduated and ungraduated 
values of $\alpha_t$, $\beta_t$ in year 0, as a special importance attaches to 
the rate of mortality in the first year of assurance. 

3. An agreement between the aggregate graduated and 
ungraduated values of these functions during the period 
between the date of entry and the ultimate table. 

To conform to these conditions as far as possible, we 
must select a curve for the values of $\beta_t$ which, whilst 
running smoothly into the constant value at the end of ten 
years, will represent fairly well the distinctly lower values of 
$\beta_t$ in the years immediately preceding. This may be done by 
representing the difference between $\log l_{x+t}$ (the value of this 
function in the “ultimate” table) and $\log l_{[x]+t}$ (the value in 
the “select” table), so far as this difference is due to changes 
in $\beta_t$, by an expression of the form $n(10-t)^2\beta c^x$, where $\beta$ is 
the ultimate value; whence we have the corresponding 
difference: 

$$\beta c^{x+t} - \beta_t c^{x+t} = 2n(10-t)\beta c^x = 2n(10-t)c^{-t}\beta c^{x+t}$$

so that $\beta_t = [1 - 2n(10-t)c^{-t}]\beta$. 


The result of this is to eliminate from the $\beta$ constant at 
the later durations part of the effect of selection, and 
somewhat to exaggerate the effect in earlier years. 
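
The behaviour of $\beta_t = [1 - 2n(10-t)c^{-t}]\beta$ over the ten select years can be tabulated as below. Only the ultimate value .0000466 is taken from the text; the value of $n$ and the choice $\log_{10} c = .04$ are assumptions made for illustration, and the function name is hypothetical.

```python
def beta_t(t, beta, n, c):
    """beta_t = [1 - 2n(10 - t) c**(-t)] * beta for t < 10, else the ultimate beta."""
    if t >= 10:
        return beta
    return (1 - 2 * n * (10 - t) * c ** (-t)) * beta


BETA = 0.0000466          # ultimate value quoted in the text
N_CONST = 0.01            # hypothetical value of the constant n
C = 10 ** 0.04            # assumed; the text only says log c is near .04

table = [(t, beta_t(t, BETA, N_CONST, C)) for t in range(0, 12)]
```

With these assumed constants $\beta_t$ starts at four-fifths of the ultimate value and rises steadily to meet it at duration 10, the factor $(10-t)$ securing the junction with the ultimate table.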

We have now to decide as to the curve best representing 
the values of $\alpha_t$. The method employed will depend very 
much on the character of the experience we are treating. In 
the $O^{M}$ Experience it was again found convenient to adopt 
an expression for the difference of $\log_{10} l_{x+t}$ and $\log_{10} l_{[x]+t}$, so 
far as this difference was due to change in $\alpha_t$, containing a 
term similar to that due to $\beta_t$ with the addition of a further 
term representing a geometrical series rapidly diminishing as 
$t$ increased. The final form of the equation for the $O^{[M]}$ 
Experience was as under: 

$$\log_{10} l_{[x]+t} = \log_{10} l_{x+t} - m(10-t)^2 - m'(c')^t - n(10-t)^2\beta c^x.$$

Having determined the form of this equation, the simplest 
method for determining the constants is to express in terms 
of them the difference between the computed deaths by the 
ultimate table of mortality, and the actual deaths for each age 
or each group of ages and each year of assurance. 

We have in that way a series of equations for determining 
the values of these constants $m$, $m'$, $c'$, $n$, and hence of 
$A_t$ and $B_t$ for each value of $t$, similar in principle to the 
equations used for determining the values of the original 
constants A and B. The only point that arises is as to what 
particular way we are to group the observations to determine 
those values. 

The value of m in the above formula having been 
ascertained with a view to representing as nearly as 
practicable the effect of selection upon the constant B, there 
remain in all, four unknown quantities in” the formula 
to be determined, and the actual equations used to 
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determine them were formed by taking the first and second 
summations by ages of the whole of these . expressions, 
representing the difference between the “select” and 
“ ultimate ” rates, first for year of assurance 0 alone, and then 
for the whole of the ten years. . 

The selection of these particular groups is, of course, not
a question of principle, but of convenience. Each case must
be treated with reference to the nature of the curve of
selection, as brought out by the statistics, and such a process
adopted as appears to be calculated to bring out the best
results in the particular case in question.

It may happen in certain tables that it is inconvenient to
trace out the effect of selection for so many years, and in
particular this is the case in a table representing the mortality
of annuitants. In such a table the effect of selection (which
is here the self-selection of the annuitant) persists for a very
long time. In a table of insured lives, owing to the cessation
of new entrants in middle life, practically at about age 55,
the mortality at the older ages is but slightly affected by
selection. In the case of annuitants, where there is a
constant inflow of fresh lives up to 75 or 80 years of age,
the mortality is affected by this cause throughout the whole
extent of the table. To represent completely the effect of
selection in such an experience will require an elaborate series
of tables, showing for each entry age the value of annuities
for each year elapsed since entry for many years' duration.
The tables given in “Principles and Methods”, pp. 124, 125,
show that as regards the $O^{am}$ and $O^{af}$ Experience, and doubtless
the same feature would be found to be general, the values of
the expectation of life ten years after entry are appreciably
greater than the values for the same ages derived from the
“ultimate” rate of mortality ($e_{[x]+10} > e_{x+10}$). Consequently, if
the graduated rates of mortality for the first five or ten years
from entry are employed in conjunction with rates representing
the aggregate mortality after five or ten years, as the case
may be, the ultimate values of the annuities, and also the
values at the date of entry, will on the whole be
underestimated. In any table used for the grant of annuities
it is, however, most important that annuities at the date of
entry shall not be undervalued, and of only less importance
that the values in succeeding years shall be such as may
be safely employed in estimating reserves. Any method,
therefore, of treating an annuity experience which tends to
underestimate the values of annuities is clearly unsuitable.
Full weight must accordingly be given to the effect of
selection, but to avoid the heavy work involved in a complete
analysis, the expedient may be adopted of computing a
hypothetical table of mortality which will correspond in the
values of the annuities, let us say, five years from the date
of entry. If this can be done successfully and the rates of
mortality for the first five years joined on smoothly with the
rates in such hypothetical table, we shall then have a correct
measure of the value of annuities at entry and for the five
years following, while thereafter the values will be slightly,
but not seriously, overestimated, an error which will be on
the right side.

We may take as our basis either the values of the
“expectation of life” or of the annuities at a suitable rate
of interest. We will assume the former to be adopted. As
these values ($e_{[x]+5}$) will depend upon separate groups of data,
viz., the entrants at individual ages, it will not be practicable
to construct an ungraduated table of $p_x$ from the formula

$$p_x = \frac{e_x}{1 + e_{x+1}},$$

the irregularities in the individual values of $e_x$
leading to anomalous results. A better plan will be to
graduate the table of expectations. For this purpose, we
may assume any frequency curve which will represent these
expectations satisfactorily, for example, a curve such as
$\log_{10} e_x = a + bx + cx^2 + dx^3 + fx^4$. We may employ values of $e_x$
deduced from the experience of individual ages at entry,
or we may combine the entrants in quinary groups of ages,
taking due account of the true average age of each group
of entrants.
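
The anomalous results spoken of may be seen from a small numerical sketch. It assumes curtate expectations, for which $e_x = p_x(1 + e_{x+1})$ and hence $p_x = e_x/(1 + e_{x+1})$; the rates of survival used are hypothetical.

```python
def e_from_p(p):
    """Curtate expectations from survival rates by e_x = p_x (1 + e_{x+1}).

    The expectation beyond the last tabulated age is taken as zero.
    """
    e = [0.0] * (len(p) + 1)
    for i in range(len(p) - 1, -1, -1):
        e[i] = p[i] * (1.0 + e[i + 1])
    return e


def p_from_e(e):
    """Survival rates recovered from expectations: p_x = e_x / (1 + e_{x+1})."""
    return [e[i] / (1.0 + e[i + 1]) for i in range(len(e) - 1)]


# Hypothetical smooth rates of survival at four successive ages.
smooth_p = [0.99, 0.98, 0.97, 0.95]
smooth_e = e_from_p(smooth_p)

# Disturb one expectation by 1 per-cent, as an independent observation might.
rough_e = list(smooth_e)
rough_e[2] *= 1.01
rough_p = p_from_e(rough_e)
```

A disturbance in a single value of $e_x$ throws the rate at the preceding age down and the rate at that age up, which is why the expectations are graduated before any rates are deduced from them.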

The only point of importance where difficulty arises is the
weighting of the different equations. These are not of equal
weight because the expectations of life, as deduced from the
unadjusted experience, are based upon a smaller or larger
experience, as they fall at the extremes or in the middle of
the table, and some method must be devised for giving due
weight to this fact. This may be done by simply weighting
the equations with the actual number of entrants at that
particular age, and much may be said for this method
although it slightly underestimates the weights at the
extreme ages. If we are dealing, for example, with the
values of annuities, and approximately the same result will
be arrived at when working with the expectations of life, the 
plan of weighting the unadjusted values in proportion to the 
number of lives entering at each age, would make the total 
cost of all the annuities by the graduated table the same as 
by the ungraduated, an agreement that would have some 
practical value. In the alternative, we may consider that 
each value of the expectation of life (or of the annuity, as 
the case may be) should be weighted in proportion to the 
reciprocal of its average error. Thus if $e_{[x]+5} = A \pm z$, where
A is the observed value and z the average error, we shall have
$\frac{e_{[x]+5}}{z} = \frac{A}{z} \pm 1$. It is difficult to determine satisfactorily the
average error in the value of the unadjusted expectation of
life,* the problem being complicated by the incompleteness
of the observations due to the “existing”. A fairly
satisfactory method of estimating the average error would 
be as follows. Taking the series consisting of the values
of $e_x$ for all values of x, each of those values depending on
a given age at entry only, we may assume that the observed
second differences of these quantities, $e_{x-1} - 2e_x + e_{x+1}$, which,
in a well graduated table, would be very small, are due to
the errors of observation in the values $e_{x-1}$, $e_x$, and $e_{x+1}$. In
any particular group of entry ages, we may say that the
average of the central second differences (taken irrespective
of sign) will be, on the average, proportional to the average
error in $e_x$ for that particular group.† Computing the average
values of the central second differences (without sign), for 
various sections of the table, and drawing a smooth curve 
through them, we should obtain values from which suitable 
relative weights for the individual observations could be 
deduced. 
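
The proportionality between the central second differences and the errors of the individual values may be checked by a short sketch. It assumes that the errors in $e_{x-1}$, $e_x$ and $e_{x+1}$ are independent, of equal magnitude, and normally distributed; these assumptions, and the function names, are introduced for illustration only.

```python
import math
import random


def second_difference_error_factor():
    """Ratio of the error of e[x-1] - 2 e[x] + e[x+1] to the error of one e.

    With independent errors of equal size the variances add in proportion
    to the squares of the coefficients 1, -2, 1.
    """
    coefficients = (1, -2, 1)
    return math.sqrt(sum(k * k for k in coefficients))


def simulated_factor(trials=100_000, seed=1):
    """The same ratio estimated from averages taken irrespective of sign."""
    rng = random.Random(seed)
    total_single = 0.0
    total_second = 0.0
    for _ in range(trials):
        a, b, c = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        total_single += abs(b)
        total_second += abs(a - 2 * b + c)
    return total_second / total_single
```

Both computations give a factor of about 2.45, that is, the square root of 6.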

This would be a very fair method of determining practically
the weight to be attached to the values of $e_x$ in different parts
of the table. Or we may proceed, as was actually done in
the case of the annuity experience graduation, by assuming
the error in the value of $e_x$ to be a function, first, of the total
number of deaths in the experience representing the particular
entry age, and secondly, of the age x. This method may
appear somewhat arbitrary, but as only the relative weights
are in question, it is sufficient for the purpose. It must be
understood that the relative weights adopted do not very
greatly affect the results. The values of Makeham’s constants
as deduced, for example, from the values of $\log l_x$ for ages
25, 45, 65, 85, thus giving equal weight to the observed
value of mortality from ages 25 to 85, would not generally
differ materially from the values resulting from a careful
system of weighting, although, of course, the latter are to
be preferred.


* See, however, the Sixth Lecture, pp. 100-104.
† The average value of $e_{x-1} - 2e_x + e_{x+1}$ will be $\sqrt{6}$ times the average error in $e_x$.

Assuming the “exposed to risk” to remain unchanged,
the average error in the observed number of deaths is
approximately $\pm 0.8\sqrt{nq(1-q)}$, where n is the total of the
“exposed to risk” and nq the total deaths. The average
percentage error in the total deaths will, therefore, be
proportionate to $\pm\sqrt{\frac{1-q}{nq}}$. If we suppose that this average
error is distributed uniformly through all ages passed through
by the particular group of entrants, we can then arrive at a
rough estimate of the average error in the observed value of
$e_x$, by computing the effect of a change of, say, 1 per-cent in
the mortality rates throughout.
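
The two expressions may be put into a short sketch. It assumes that the factor 0.8 is the ratio of the average error to the standard deviation in a normal distribution, $\sqrt{2/\pi}$; the function names are hypothetical.

```python
import math

# Ratio of the average (absolute) error to the standard deviation,
# assuming a normal distribution of errors; about 0.798.
AVERAGE_TO_STANDARD = math.sqrt(2.0 / math.pi)


def average_error_in_deaths(n, q):
    """Average error in the number of deaths out of n exposed at rate q."""
    return AVERAGE_TO_STANDARD * math.sqrt(n * q * (1.0 - q))


def average_percentage_error(n, q):
    """Average error expressed as a proportion of the expected deaths nq."""
    return average_error_in_deaths(n, q) / (n * q)
```

Since the proportionate error varies as $\sqrt{(1-q)/nq}$, an experience with four times the exposed to risk at the same rate has half the proportionate error.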

The assumptions here are not strictly accurate, as errors
in the value of $e_x$ arise not only from the total number
of deaths being greater or less than the expected amount,
but from the manner in which the excess or defect of
mortality is distributed through the table. The neglect of
this second source of error will not, however, seriously affect
the relative weights arrived at, and for practical purposes the
relative average errors in the value of $e_x$ will be dependent,
first, on the average error in the total deaths observed in the
experience from which it is deduced, and second, on the
extent to which a given percentage error in the mortality
distributed uniformly through the table will affect the value
of $e_x$. The product of these two factors may be taken as
representing sufficiently approximately the expected error in
the value of $e_x$, remembering always that this estimated error
is not an absolute, but a relative measure at the various ages.
When this is done, we have, by taking the reciprocals of
those quantities, the weights which we shall give to the
observed values of $e_x$ in order to determine our constants.
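
The formation of the weights may be sketched as follows. The sketch assumes curtate expectations, takes the proportionate error in the deaths as varying simply with the reciprocal of the square root of their number (the factor $1-q$ being neglected), and uses a hypothetical table of rates; none of these figures is taken from any actual experience.

```python
import math


def curtate_expectation(q):
    """Curtate expectation at the first age of the list of rates q."""
    expectation, survivors = 0.0, 1.0
    for qx in q:
        survivors *= 1.0 - qx
        expectation += survivors
    return expectation


def sensitivity(q, bump=0.01):
    """Fall in the expectation when every rate is raised by the proportion bump."""
    raised = [min(1.0, qx * (1.0 + bump)) for qx in q]
    return curtate_expectation(q) - curtate_expectation(raised)


def relative_weight(q, total_deaths, bump=0.01):
    """Reciprocal of (proportionate error in deaths) x (effect on the expectation).

    Only the ratios of these weights at different entry ages have meaning.
    """
    proportionate_error = 1.0 / math.sqrt(total_deaths)
    return 1.0 / (proportionate_error * sensitivity(q, bump))


# Hypothetical rates of mortality for sixty successive ages.
HYPOTHETICAL_Q = [min(1.0, 0.01 * 1.09 ** k) for k in range(60)]
```

With the rates unchanged, an entry age resting on four times as many deaths receives twice the weight.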

It is necessary to point out that this process, while suitable
for expectations calculated from entrants at a particular age
or small groups of ages, will not apply to aggregate tables;
for in their case the percentage error in the total deaths
above age x steadily increases as x increases, so that this
method would produce weights steadily diminishing from the
youngest age to the oldest, which would obviously be
incorrect.


Notwithstanding the important effect of selection on
mortality, it is frequently ignored, as in the $H^{M}$ and $O^{M}$
Tables. It is important to consider, therefore, what is the
net effect in a mortality table of neglecting altogether the
factor of selection. Considerable additional labour attaches
to the use of select tables for valuation purposes, and the
question may be asked what kind of errors do we make if we
neglect the fact that mortality is a function not only of the
age, but also of the duration of assurance, and treat it simply
as a function of the age as it is treated in the $O^{M}$ and $H^{M}$
Tables. In a mortality table representing assured lives the
effect will be seen if we compare a table like the $H^{M}$ Table
with a table like Dr. Sprague’s Select Table, or if we compare
a table such as the $O^{M}$ Table with a table like the $O^{[M]}$
Select Table:


Comparison of Annual Premiums for the Assurance of 100 
(8 per-cent interest.) 


If we compare, as is most convenient, either annuity or
premium-values, we shall find that the effect of ignoring the
element of selection and treating the mortality rates as
a function of the age alone is that, at the younger entry
ages, premiums are underestimated and annuity-values are
overestimated.* The $O^{[M]}$ premiums should, properly speaking,
be compared with those derived from a table representing the
true aggregate of the select tables, but no such table is
available. There is a point, which is in general somewhat greater
than the average age at entry, at which the two curves
representing the premium values for the mixed and select
data cross each other, and for the older ages the premiums by
mixed tables are greater than those by the select table. The
extent of the differences in the premiums is sufficient to
render it necessary, in adopting a basis for assurance
premiums, to take into account the question of selection. The
only plan by which the use of select tables can safely be
avoided is either by adopting a special form of loading
or by throwing out altogether from the data upon which the
premiums are based those years of assurance which are
seriously affected by selection, that is to say by employing
a table of the $H^{M(5)}$ or $O^{M(5)}$ type. We then obtain a table
which at all ages overestimates the values of the premiums
and underestimates the values of annuities.

A table representing “ultimate” rates of mortality, that
is, of the $H^{M(5)}$ or $O^{M(5)}$ type, is therefore a safe one to employ
for the grant of assurances, although not for the grant of
annuities. There is, indeed, very much to be said for the use
of a table of that kind for assurance purposes, but, to
discuss that question, we should have to go into the finance of
life assurance valuations, which hardly comes within the
scope of our subject.

* This is shown, in the table above, to be the case both with the $H^{M}$ and $O^{M}$
Tables. Unfortunately, however, in neither case is the comparison very
satisfactory. Dr. Sprague’s $H^{M}$ premiums from the method of their calculation
are probably somewhat higher than the true values, and in the case of the
$O^{[M]}$ Table we are comparing select premiums based in part upon the aggregate of
the select tables, excluding the first ten years from entry, with $O^{M}$ premiums based
upon an aggregate table from which there had been a further elimination
of duplicate assurances.

With a view of avoiding the necessity for select tables, a
device was adopted by the American offices in their first
experience denominated the “final series” method. The
object was to produce a table not entirely unaffected by
selection, but in which its influence would be reduced to a
minimum; a table of mortality similar to that which might be
supposed to prevail in an office of great age doing a uniform
and steady new business. To produce that result the lives
existing at the close of the observations were traced out
through a hypothetical future in which they were assumed to
be subject to rates of mortality and lapse identical with
the rates actually observed in the past among lives of similar
age and duration. The minor details of the process we may
pass over. The result from a financial point of view is that
the premiums are still underestimated for the younger
insuring ages, although not to the same extent as in a table of
the $H^{M}$ type, and are overestimated at the older ages, the
point at which the values cross the true curve being earlier
than would have been the case had the “final series”
adjustment not been used. There are some practical
difficulties in adopting a method of this kind. One of these
is that after some 15 or 20 years’ duration the observed rates
of mortality for individual ages and years of assurance
depend on a very few facts. We then have to apply
the very irregular rates resulting from those few facts
to much larger numbers, including the existing lives that
have been brought back hypothetically under observation;
so that where these irregularities become inconveniently
large, the application of the method must cease; or else
these irregular rates must be subjected to some process of
graduation before being used in the calculations.

This difficulty could be met by using a species of
$O^{M(15)}$ or $O^{M(20)}$ Table for risks of 15 or 20 years’ duration and
upwards, instead of the rates of mortality deduced from
individual years of assurance. There are, however, other
objections to this method as an expedient for counteracting
the effect of a too short average duration of assurance.

As the rate of mortality amongst assured lives cannot
strictly be treated as a function of age alone, but is also
dependent upon the duration of assurance, so the rates of
sickness in a Friendly Society, or of re-marriage in a
Widow’s Fund, are affected, respectively, by the duration
of membership, or of widowhood. Sufficiently approximate
results may, however, be generally arrived at in these cases
by treating the rate of sickness, or of re-marriage, as a
function of the age alone: in the former case because
the effect of selection is not very great and is soon exhausted,
in the latter case because the average constitution, as regards
the duration of widowhood, of a group of lives passing under
observation at a given age will be found to remain fairly
constant (unless the Pension Fund is of recent establishment)
and the financial effect of a marriage when it occurs is a
function of the age only.

Where, however, we are dealing with rates of
discontinuance or lapse, it is important that these should be
analyzed both as respects age and duration. Owing to the
fact that the financial effect of a discontinuance is mainly
dependent upon the duration of assurance, very erroneous
conclusions may be deduced by treating the rates as functions
of the age alone, as has sometimes been done. If this course
is adopted special precautions must be taken, such, for example,
as deducing the rates from a body of lives representing the
“existing” some 10 or 20 years back, and excluding from the
“exposed to risk” all more recent entrants, as proposed by
Mr. A. W. Watson (J.I.A., xxxv, 313-4).


SIXTH LECTURE. 


In the concluding Lecture we shall deal with some
miscellaneous points of general interest or arising out of the
previous Lectures. We have already dealt with the nature
of the modifications of Makeham’s formula for the force of
mortality, necessary to enable us to represent satisfactorily
the mortality shown by select tables such as the $O^{[M]}$.
These modifications consisted in treating the quantities A
and B, or $\alpha$ and $\beta$, in the formulas


$$\mu_{[x]+t} = A + Bc^{x+t}; \qquad \operatorname{colog}_{10} p_{[x]+t} = \alpha + \beta c^{x+t}$$


which are constants as regards the variable x, as functions of
t, the time elapsed since the date of selection.

It is clear that a similar course may be pursued if any
other formula than Makeham’s is employed in the graduation
of the “ultimate” table. Thus we may write


$$\mu_{[x]+t} = A_t + B_t\,\mu_{x+t}$$


where $A_t$ and $B_t$ will in general be such functions that, as t
reaches a certain value, at which the select and ultimate
mortality rates merge, $A_t$ becomes zero and $B_t$ unity. The
form of these expressions employed for representing the
effect of selection suggests that a similar form may be
employed for representing rate of discontinuance, which in
general may be taken to be a function of the duration of
assurance and of the age at entry. The same remark applies
to such a function as the rate of remarriage amongst widows,
which is, similarly, a function of the duration of widowhood
and of the age.




Although we have dealt at considerable length with the
use of Makeham’s formula in connection with mortality
tables, there are some further remarks to be made as to its
employment in certain special cases, more particularly in
connection with the age statistics at a Census.

If we suppose a population which is (1) subject to uniform
rates of mortality, corresponding at the adult ages to
Makeham’s formula, (2) such that the numbers living
represent the survivors from a number of births increasing
annually in a geometrical progression, and (3) is subject
to a rate of emigration or immigration uniform at all
ages, then if $l'_x$ represent the numbers in the population,
at a given moment of time, passing through the exact age x,
obviously the curve of $l'_x$ will follow Makeham’s formula,
and if we write


$$-\frac{1}{l'_x}\frac{dl'_x}{dx} = (A + r) + Bc^x$$


we shall have a formula similar to the usual formula for the
force of mortality, but with the constant A increased by r,
the rate per annum at which the population is increasing;
that is to say, the “natural” rate of increase less the rate
of emigration. It is true that hardly any population
will be found to conform very closely to the above
assumptions, but nevertheless it will be frequently found
that the population curve for the adult ages does conform
to Makeham’s formula for $l_x$, although in most cases it
will be necessary to adopt Makeham’s second development
of Gompertz, with the additional constant in the expression
for $\log l_x$.
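
The effect of the rate of increase upon the constant A may be verified numerically. The sketch below assumes that the numbers passing through age x at a given moment are proportional to $l_x e^{-rx}$, which follows from the births increasing in geometrical progression at the rate r; the constants are hypothetical and natural logarithms are used.

```python
import math


def log_population_curve(x, A, B, c, r, K=0.0):
    """Natural logarithm of l'_x for a Makeham table with births growing at rate r.

    log l_x = K - A x - B c**x / log(c), and the growth of the births
    contributes the further term -r x.
    """
    return K - (A + r) * x - B * c ** x / math.log(c)


def minus_log_derivative(x, A, B, c, r, h=1e-5):
    """Numerical value of -(1 / l'_x) dl'_x / dx by a central difference."""
    upper = log_population_curve(x + h, A, B, c, r)
    lower = log_population_curve(x - h, A, B, c, r)
    return -(upper - lower) / (2.0 * h)


# Hypothetical constants, for illustration only.
A, B, C, R = 0.006, 5.0e-5, 1.095, 0.012
```

At any adult age the computed value agrees with $(A + r) + Bc^x$, so that a census curve of this kind cannot by itself separate the constant A from the rate of increase r.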

If the population is given, as is usual, for decennial age
groups (e.g., 15-25, 25-35, 35-45, &c.), the values of the
ordinate for the middle age of each group may be
obtained with sufficient approximation by deducting from
each term $u_x$ of the series representing the numbers in
successive age groups one twenty-fourth of the central
second difference:


$$u_x - \frac{\Delta^2 u_{x-1}}{24}.$$


From the values of $l'_x$ thus obtained, by writing

$$\log l'_x = K + sx + gc^x,$$
or, $$\log l'_x = K + sx + hx^2 + gc^x$$


as the case may be, the constants may be determined as for a
mortality table.
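
The correction of one twenty-fourth of the central second difference may be illustrated as under. The sketch takes the width of a group as the unit of age, so that the ordinate is expressed as a number per group-width; the quadratic curve used for the check is hypothetical, and for such a curve the rule is exact.

```python
def central_ordinates(groups):
    """Ordinates at the middle ages of the interior groups.

    Each is the group total less one twenty-fourth of the central
    second difference of the group totals.
    """
    return [
        groups[i] - (groups[i - 1] - 2.0 * groups[i] + groups[i + 1]) / 24.0
        for i in range(1, len(groups) - 1)
    ]


def curve(t):
    """Hypothetical curve f(t) = 3 + 2t + 5t^2."""
    return 3.0 + 2.0 * t + 5.0 * t * t


def group_total(k):
    """Integral of the hypothetical curve over the group from k - 1/2 to k + 1/2."""
    return 3.0 + 2.0 * k + 5.0 * (k * k + 1.0 / 12.0)


totals = [group_total(k) for k in range(5)]
ordinates = central_ordinates(totals)
```

The first and last groups yield no ordinate, as no central second difference can be formed for them.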

Take, for example, the male population of England and
Wales, enumerated at the Census of 1901, as under:


TABLE XII.
Male Population in Age-groups: England and Wales, 1901.*

 Age-group   Central Ordinate      log (3)   Δ log (3)   Δ² log (3)   Δ³ log (3)   Col. (4)
             u_x - Δ²u_{x-1}/24                                                    adjusted
    (1)            (3)               (4)        (5)         (6)          (7)         (8)
   25-35         76,373            4.8829    -0.1093     -0.0322      -0.0138      4.88349
   35-45         59,371            4.7736    -0.1415     -0.0460      -0.0485      4.77301
   45-55         42,863            4.6321    -0.1875     -0.0945      -0.0987      4.63269
   55-65         27,838            4.4446    -0.2820     -0.1932                   4.44401
   65-75         14,541            4.1626    -0.4752                               4.16319
   75-85          4,868            3.6874                                          3.68681


* To reduce the magnitude of these numbers, the figures used are those
corresponding to a total population (M & F) of 1,000,000 as given in the Census
Report. This, of course, does not affect their relative value nor the form of the
curve.


Here, evidently, Col. (6) cannot be well represented by a
Geometrical Progression, but with Col. (7) this is possible
without very serious changes in the values. This would give
a formula corresponding to Makeham’s second modification of
Gompertz, viz.,


$$\log l'_x = K + Ax + A'x^2 + Bc^x$$


for the values of the logs of the numbers living at age x,
given in Col. (4). As these numbers are only approximate,
and our object is merely to show the applicability of the
formula as a base line, we may adopt a very simple method
of determining the constants, similar to that used by
Mr. Makeham in his paper on the Law of Mortality (J.I.A.,
xiii, p. 338 et seq.). If the terms in Col. (4) are alternately
diminished and increased by a quantity z, the quantities in
Col. (7) will become


-0.0138 + 8z
-0.0485 - 8z
-0.0987 + 8z

These terms can obviously be made to form a geometrical
progression by suitably determining z, and their common
ratio, found by dividing the sum of the second and third
terms by the sum of the first and second, will be equal to
$\frac{0.1472}{0.0623} = 2.363$.
Dividing the sum of the first two terms by 3.363 we get
$\frac{-0.0623}{3.363} = -0.01853$ as the adjusted first term, giving
$8z = -0.00473$ and $z = -0.00059$. Hence the transformed series for
Col. (4) is as shown in Col. (8), where the progression
accurately follows Makeham’s second development.
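
The arithmetic of this adjustment may be retraced in a few lines. The sketch uses the logarithms of Col. (4) as read from Table XII; the function and variable names are introduced here for illustration.

```python
def differences(values):
    """First differences of a series."""
    return [b - a for a, b in zip(values, values[1:])]


# Col. (4) of Table XII: logarithms of the central ordinates.
LOGS = [4.8829, 4.7736, 4.6321, 4.4446, 4.1626, 3.6874]

# Col. (7): third differences of the logarithms.
third = differences(differences(differences(LOGS)))

# Common ratio: sum of the second and third terms over the sum of the
# first and second, which is unaffected by the quantity z.
ratio = (third[1] + third[2]) / (third[0] + third[1])

# The adjusted first term, and hence z, since the first term becomes
# third[0] + 8z.
adjusted_first = (third[0] + third[1]) / (1.0 + ratio)
z = (adjusted_first - third[0]) / 8.0

# Col. (8): the terms of Col. (4) alternately diminished and increased by z.
adjusted = [v - z if i % 2 == 0 else v + z for i, v in enumerate(LOGS)]
adjusted_third = differences(differences(differences(adjusted)))
```

The adjusted third differences stand in the constant ratio 2.363 to one another, and the adjusted logarithms agree with Col. (8) to the fifth place.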


It is on the whole more convenient to deal with the
numbers living above age x rather than the numbers for
the decennial age groups.

If we treat the numbers in Table XII in this manner,
representing the numbers living above age x by the
expression


$$\log Q_x = K + ma^x + nb^x$$


we shall have the results set out in the following table, where
the values of the constants have been determined by ignoring
the extreme values of $\log Q_x$ at ages 15 and 85, and equating
the sums of the values of the above expression to the values
of $(\log Q_{25} + \log Q_{35})$, $(\log Q_{35} + \log Q_{45})$, &c., by which means
we obtain for the values of the constants

log a = 0.006420      $ma^{25}$ = -1.0582
log b = 0.035184      $nb^{25}$ = -0.007933
K = 6.4222

The five figure logarithms of $Q_x$ were employed in the
calculation, but, owing to the nature of the process, the fifth
figure in the graduated column cannot then be relied upon;
the logs have therefore been throughout cut down to four
figures in the table, which is quite sufficient for the purpose
of illustration.


TABLE XIII.
Male Population living above the undermentioned ages. England
and Wales, 1901.
(Based upon figures in preceding Table.)

 Age    Numbers    log Q_x   Δ log Q_x   log Q'_x             Δ log Q'_x   log Q'_x - log Q_x
  x       Q_x                            = K + ma^x + nb^x
  15    321,672    5.5074    -0.1514      5.5059              -0.1498        -0.0015
  25    226,979    5.3560    -0.1783      5.3561              -0.1785        +0.0001
  35    150,554    5.1777    -0.2179      5.1776              -0.2177        -0.0001
  45               4.9598    -0.2764      4.9599              -0.2766        +0.0001
  55     48,236    4.6834    -0.3754      4.6833              -0.3753        -0.0001
  65     20,323    4.3080    -0.5573      4.3080              -0.5574         0.0000
  75      5,632    3.7507    -1.0088      3.7506              -0.9218        -0.0001
  85        552    2.7419                 2.8288                             +0.0869


The practical identity of the curves at all ages except 15
and 85, which values were not used in determining the
constants, suggests that very accurate results might be
obtained by making use of a curve of the above form for
interpolation of intermediate values of $Q_x$.
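
The graduated column may be recomputed from the constants. The sketch below assumes that the two bracketed quantities given with the constants are the values of $ma^x$ and $nb^x$ at age 25, a reading which reproduces the tabulated figures; the function name is hypothetical.

```python
# Constants of the curve log Q_x = K + m a^x + n b^x (common logarithms).
LOG_A = 0.006420
LOG_B = 0.035184
K = 6.4222
M_A25 = -1.0582      # assumed to be m * a**25
N_B25 = -0.007933    # assumed to be n * b**25


def graduated_log_Q(x):
    """Graduated value of log Q_x, the terms being carried from age 25."""
    return (
        K
        + M_A25 * 10.0 ** (LOG_A * (x - 25))
        + N_B25 * 10.0 ** (LOG_B * (x - 25))
    )


graduated = {age: round(graduated_log_Q(age), 4) for age in range(15, 95, 10)}
```

The same function serves for interpolation, since x need not be one of the tabulated ages.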


It has been proposed to employ Makeham’s formula to
represent the curve of sickness rates at successive ages, and
this has been done with a certain degree of success, but the
practical advantages of the formula as applied to sickness
rates are not very apparent, as it is usually necessary to
know not merely the total sickness rate at each age but its
division into sickness of various durations, as the number of
weeks per annum during the first six months of illness, from
the sixth to the twelfth month, after the twelfth month,
&c. As Makeham shows (J.I.A., xvi, 414), the ratio

$$\frac{\text{Weeks' sickness experienced in the year of age}}{\text{Exposed to risk in middle of year of age}}$$

is not a function similar to $\mu_x$ but to $q_x$, since it has a definite
limit, namely, 52, or 1 if the sickness is expressed in years in
lieu of weeks. Hence if we represent the above ratio by the
symbol $s_x$, we should write


$$\log(52 - s_x) = A + Bc^x.$$


Where, by the constitution of a society, there is no formal
superannuation, the sickness benefit continuing throughout
life, it is almost invariably the practice of actuaries in using
Sickness Tables for the purpose of computing contributions
or valuing benefits to assume that the so-called “sickness”
will become chronic after a certain age, 70, 75, or 80. In
such cases, as the rates of sickness actually employed will
generally be much below the maximum of 52 weeks, we may
use $\log(N - s_x) = A + Bc^x$. The value of N must be determined
by trial.

Mr. King has given an example, in the graduation of the
values in the Text-book mortality table at the youngest ages,
of a further application of Makeham’s formula, the term $Bc^x$
in the expression for the force of mortality representing, of
course, equally well an increasing rate of mortality as in adult
life or a diminishing rate as in infancy and childhood.


In the common case of an asymmetrical series the terms
of which become zero, or very nearly so, at each end, the
following method of employing the “normal” frequency
curve to represent the series will often be found convenient
and effective, particularly if the data are presented in the
form of a few groups. Let the successive ordinates of the
curve be represented by the equation $y = f(x)$; we shall
assume the total area of the curve to be unity, and the area of
the curve between the limits $x = \infty$ and $x = t$ will be $\int_t^\infty y\,dx$. Let
us write

$$Y_t = \int_t^\infty y\,dx = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^2}\,dt$$

so that

$$Y_{-\infty} = 1 = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{+\infty} e^{-t^2}\,dt$$


where z is a function of t, the form of which is to be
determined by the data. For most purposes it will be
sufficient to treat z as a parabolic function of t, but it will be
seen later that there are certain cases in which a different
hypothesis as to the form of the function z is to be preferred.

An example will make plain the method of proceeding.
Take the $O^{M}$ data as summarized on p. viii of the volume
of Unadjusted Data (Whole-life, Males). In the last two
columns of the table there is given the “proportionate
distribution per-cent” of the exposed to risk and died.
Taking the figures there given we obtain the following tables.
The values of z are found by entering a table of
$\frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^2}\,dt$
for + and − arguments with the values in the second
columns of Tables (XIV) and (XV). We may employ a table
such as that given by Woolhouse (J.I.A., vol. xvii, p. 50) or
that given on pages 138, 139, at the end of these lectures. Note,
however, that in each of these tables the function tabulated
is $\frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2}\,dt$, say $I_z$, for + arguments only, so that the total
area of the curve from $-\infty$ to $+\infty$ is 2 instead of 1. Hence,
if $Y_t$ is $> \frac{1}{2}$ we must put


$$Y_t - \frac{1}{2} = \frac{1}{\sqrt{\pi}}\int_0^z e^{-t^2}\,dt$$


so that z takes the value corresponding to the tabular value
$I_z = 2Y_t - 1$. Similarly, if $Y_t$ is $< \frac{1}{2}$ we put z negative and
numerically equal to the argument giving $I_z = 1 - 2Y_t$.
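
In place of a printed table the values of z may be found by machine. The sketch below assumes that the tabulated function $I_z$ is the ordinary error function, available in Python as `math.erf`, so that $Y_t = \frac{1}{2}(1 + I_z)$ with the sign of z attended to automatically; the inversion is made by repeated bisection and the function name is hypothetical.

```python
import math


def z_from_proportion(Y, lower=-6.0, upper=6.0, steps=200):
    """Value of z for which (1 / sqrt(pi)) * integral of exp(-t^2) up to z equals Y.

    Y must lie strictly between 0 and 1; the end values correspond to
    infinite z and are dealt with separately in the text.
    """
    if not 0.0 < Y < 1.0:
        raise ValueError("Y must lie strictly between 0 and 1")
    for _ in range(steps):
        middle = (lower + upper) / 2.0
        if 0.5 * (1.0 + math.erf(middle)) < Y:
            lower = middle
        else:
            upper = middle
    return (lower + upper) / 2.0
```

Thus a proportion of 0.28908 above a given age gives a negative z of about 0.393, and a proportion of exactly one half gives z = 0.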


TABLE XIV.


 Age t   Proportion Exposed to Risk above age t   Values of z


* Theoretically the values of z corresponding to a total frequency of 1 and 0
are respectively $\pm\infty$. As, however, $z = \pm 3$ corresponds to Y = 0.999989 or
0.000011, $z = \pm 3.5$ to Y = 0.99999963 or 0.00000037, and $z = \pm 4$ to
Y = 0.9999999923 or 0.0000000077, it will be seen that any value of z over 3
will sufficiently represent the complete distribution or the zero value, and in
practice it would be quite sufficient to insert at the ends of the table any
convenient value of z over 3 and consistent with the general run of the
intervening terms.


TABLE XV.


$O^{M}$ Data. Deaths.

 Age   Proportion of   Values    Mean Error      Δz        Δ²z       Δ³z       Δ⁴z       Δ⁵z
  t    Deaths above    of z      of z in last
       age t                     decimals
 (1)       (2)          (3)         (4)
   0     1.00000         ∞ *
  10     1.00000         ∞ *
  20     0.99925       2.2450                 -0.8511    0.3190   -0.1900    0.0717   -0.0312
  30     0.97565       1.3939      ± 48       -0.5321    0.1290   -0.1183    0.0405   -0.0285
  40     0.88854       0.8618      ± 28       -0.4031    0.0107   -0.0778    0.0120   -0.0159
  50     0.74174       0.4587      ± 24       -0.3924   -0.0671   -0.0658   -0.0039
  60     0.53731       0.0663      ± 23       -0.4595   -0.1329   -0.0697
  70     0.28908      -0.3932      ± 24       -0.5924   -0.2026
  80     0.08169      -0.9856      ± 32       -0.7950
  90     0.00590      -1.7806      ± 82


* See note at foot of Table XIV on preceding page. It is to be noted that, in
lieu of the integral of the normal frequency function, the function $e^z/(1+e^z)$
may be used, leading to a method of procedure similar to that referred to
on p. 51.


The column containing the mean error or standard
deviation of z in the table of deaths is computed as follows.
If the total of the series (in this case the total deaths) is n,
and the total above a given point (in this case the number of
deaths above age t) is m, then the mean error in m is equal
to $\sqrt{\frac{m(n-m)}{n}}$. From this can be calculated the mean
errors of the values in column (2). The change in the value
of z corresponding to a given change in the values of
$\frac{1}{\sqrt{\pi}}\int_{-\infty}^{z} e^{-t^2}\,dt$ in column (2) being known from the table of this
function we obtain the values in column (4). These standard
deviations are not inserted in the table of Exposed to Risk, as
the principle upon which the mean errors in the proportionate
distribution of the deaths are computed is not strictly
applicable to the table of Exposed to Risk, when the latter
represent observations spread over a long and continuous
period, although it would be applicable if the numbers dealt
with represented the exposures in a single calendar year.

If we examine the columns of the successive differences
of z in the two tables, ignoring the infinite values of z
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corresponding to a total distribution of unity we shall see 
that they exhibit a remarkable similarity in the nature of 
their progression, especially from the columns AX onwards. 
It will also be apparent that a very small alteration of the 
original values of z in either table would be needed to make 
the fifth differences constant; that is, we may assume without 
serious error that 

$$z = a + bt + c\,\frac{t(t-1)}{2} + \&c.$$


In order to obtain the closest agreement with the 
original facts due regard would have to be taken of the 
weights corresponding to the mean errors in the value of 
z as given in the table. But we shall obtain results quite 
good enough for all purposes by the following simple 
procedure. It will be observed from the values of the 
mean errors that the values of z for ages 40 to 70 have 
approximately the same weight, those for ages 30 and 80 
have somewhat less weight and finally those for ages 20 
and 90 much less. 

If we combine the values of z in sets, thus,

$$z_{20} + 3z_{30} + z_{40},\qquad z_{30} + 3z_{40} + z_{50},\qquad \&c.,$$

with their corresponding numerical values we shall obtain
six equations to determine the six coefficients, a, b, c, ... f.
Into these equations the values $z_{20}$ and $z_{90}$ will enter once,
the values $z_{30}$ and $z_{80}$ four times, and the remaining values
five times. We need not compute the numerical values
for all these equations as it will be evident that if we
write them down and difference them we shall arrive at
the following:


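$$\begin{aligned}
5a+5b+c &= z_{20}+3z_{30}+z_{40} &&= 7.2885\\
5b+5c+d &= \Delta(z_{20}+3z_{30}+z_{40}) &&= -2.8505\\
5c+5d+e &= \Delta^2(z_{20}+3z_{30}+z_{40}) &&= .7167\\
5d+5e+f &= \Delta^3(z_{20}+3z_{30}+z_{40}) &&= -.6227\\
5e+5f &= \Delta^4(z_{20}+3z_{30}+z_{40}) &&= .2052\\
5f &= \Delta^5(z_{20}+3z_{30}+z_{40}) &&= -.1326
\end{aligned}$$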


From these equations the values of f, e, d, &c., can be
obtained with great facility.
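
The six equations can be solved from the last upwards, each giving one new coefficient. The following is a minimal sketch of that working, not the author's own arrangement of the arithmetic. It assumes the values of z at ages 20 to 90 read from Table XV (the age-30 value being taken as 1.3939, which reproduces the total 7.2885), with t counted in units of ten years from age 20; the function names are illustrative.

```python
# Values of z at ages 20, 30, ..., 90 (Table XV); t = 0 corresponds to age 20.
Z_OBSERVED = [2.2450, 1.3939, 0.8618, 0.4587, 0.0663, -0.3932, -0.9856, -1.7806]


def leading_differences(values):
    """Return [v0, Δv0, Δ²v0, ...] for the given sequence."""
    out = []
    values = list(values)
    while values:
        out.append(values[0])
        values = [b - a for a, b in zip(values, values[1:])]
    return out


def fit_fifth_difference(z):
    """Solve 5*c[k] + 5*c[k+1] + c[k+2] = Δ^k(z0 + 3*z1 + z2), k = 0..5."""
    sums = [z[i] + 3 * z[i + 1] + z[i + 2] for i in range(len(z) - 2)]
    rhs = leading_differences(sums)          # 7.2885, -2.8505, 0.7167, ...
    coeffs = [0.0] * 8                       # two trailing zeros beyond f
    for k in range(5, -1, -1):               # back-substitution: f, e, d, c, b, a
        coeffs[k] = (rhs[k] - 5 * coeffs[k + 1] - coeffs[k + 2]) / 5
    return coeffs[:6]


def z_at(coeffs, t):
    """Evaluate a + b*t + c*t(t-1)/2! + d*t(t-1)(t-2)/3! + ..."""
    total, term = 0.0, 1.0
    for k, c in enumerate(coeffs):
        total += c * term
        term *= (t - k) / (k + 1)
    return total


a, b, c, d, e, f = fit_fifth_difference(Z_OBSERVED)
print([round(v, 6) for v in (a, b, c, d, e, f)])
print(round(z_at([a, b, c, d, e, f], -1), 5))   # adjusted z at age 10
```

Working from the bottom equation avoids any simultaneous solution: 5f is given at once, then 5e follows from the fifth equation, and so on up to a.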
Having obtained a formula for z in terms of t, we can now
obtain any term in the series and can also obtain the
value of y, the ordinate representing the number of deaths at
age x (i.e., approximately between ages x−½ and x+½) since

$$y = -\frac{1}{\sqrt{\pi}}\,e^{-z^2}\,\frac{dz}{dt}$$

and

$$\frac{dz}{dt} = b + c\left(t-\tfrac{1}{2}\right) + d\left(\tfrac{1}{2}t^2 - t + \tfrac{1}{3}\right) + \&c.$$

It will generally be sufficient to compute the values of y
for decennial or at most quinquennial intervals and to
interpolate the resulting values of $q_x$ or $m_x$ for the
intermediate ages.

The values of the quantities a, b, c, &c., satisfying the
above equations, are


a =  2.24374        d = −.186796
b = −.849365        e =  .067560
c =  .316624        f = −.026520


It may be of interest to give the adjusted values of z and
the distribution of deaths corresponding to these, which are as
under:


TABLE XVI.
O™ Data. Deaths.
Adjusted values of z and adjusted distribution of Deaths.


                                                   Last column more (+) or
 Age        z        (1/√π) ∫ e^{−t²} dt           less (−) than corresponding
                                                   column in Table XV.
                                                      +          −
   0     6.13645         1.00000 *
  10     3.69061         1.00000
  20     2.24374          .99925                     ...        ...
  30     1.39438          .97569                   .00004       ...
  40      .86166          .88848                     ...      .00006
  50      .45872          .74174                   .00000       ...
  60      .06640          .53740                   .00009       ...
  70     −.39352          .28893                     ...      .00015
  80     −.98473          .08188                   .00019       ...
  90    −1.78289          .00585                     ...      .00005
 100    −2.90220          .00002                     ...        ...
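
The adjusted figures can be regenerated from the six coefficients. The sketch below assumes the coefficient values printed above, t in units of ten years from age 20, and the proportion above age t taken as $\tfrac12(1+\operatorname{erf} z)$, which is the normal integral used in the table written with the error function. The names are illustrative only.

```python
from math import erf

# Coefficients a, b, c, d, e, f as printed in the text.
COEFFS = [2.24374, -0.849365, 0.316624, -0.186796, 0.067560, -0.026520]


def z_at(t, coeffs=COEFFS):
    """Evaluate a + b*t + c*t(t-1)/2! + d*t(t-1)(t-2)/3! + ..."""
    total, term = 0.0, 1.0
    for k, c in enumerate(coeffs):
        total += c * term
        term *= (t - k) / (k + 1)
    return total


def proportion_above(z):
    """(1/sqrt(pi)) * integral from -infinity to z of exp(-t^2) dt."""
    return 0.5 * (1.0 + erf(z))


for t in range(-2, 9):                       # ages 0, 10, ..., 100
    z = z_at(t)
    print(f"{20 + 10 * t:3d}  {z:9.5f}  {proportion_above(z):8.5f}")
```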
The principal objection to this adjustment, paradoxical as it
may sound, is that it too closely follows the original facts, the
deviations being very much smaller than the probable errors
of the observations. This is, of course, due to the fact that
we have included too many constants in our formula. A
constant fourth difference in the values of z, however, may
lead to anomalous results, and a constant third difference
makes the errors of adjustment too great. The best plan in
such a case would be to adjust the exposures by using a
constant third difference, to recompute the deaths to
correspond to the adjusted exposures in the 10 year groups
and then employ a constant third difference for the graduation
of the death curve. Or, as an alternative, an expression for
z may be assumed of the form


and the values of k, m, n, a, b, determined by weighting the 
equations in a manner similar to that shown above for the 
fifth difference curve. 

We have used the O™ data to illustrate the above process, 
but generally speaking the latter will be found more useful 
where the data are only available in large groups, and, in 
particular, where the limits of the series are not well defined. 

In the following table we have a statement taken from the
Supplement to the Registrar-General’s 45th Annual Report,
p. cxviii, showing the number of Innkeepers, &c., living at
or over certain given ages.


TABLE XVII.
Innkeepers, Publicans, &c. (1881).


 Ages    Living          Proportional    Corresponding
  t      above age t     numbers         values of z

  15      232,890          1.0000             ∞
  20      230,280           .9888           1.6147
  25      222,213           .9542           1.1929
  45      105,153           .4515           −.0862
  65       14,451           .0620          −1.0877


It will be seen that more than 50 per-cent of the numbers
living are in the age-group 25-45, and nearly 40 per-cent in
the group 45-65. In such a series the usual methods of
interpolation would probably give unsatisfactory results.

If we treat the values of z as having constant third 
differences, we obtain the following equations, taking five 
years of age as the unit— 


$$\begin{aligned}
a &= 1.6147\\
a+b &= 1.1929\\
a+5b+10c+10d &= -.0862\\
a+9b+36c+84d &= -1.0877
\end{aligned}$$


b, c, and d are the values, reckoning from age 20, of the
differences of z. The values of a and b are given immediately
and solving the remaining equations for c and d we obtain


a =  1.6147         c =  .04863
b = −.4218          d = −.00782


which enable us to form at once the following series of
quinquennial age groups.


TABLE XVIII.
Innkeepers, Publicans, &c. (1881 Census).


                       Corresponding       Proportional
  Age        z         (1/√π) ∫ e^{−t²} dt *   Population between
   t                                            t and (t+5)

   15      2.0930          .9985                   97
   20      1.6147          .9888                  346
   25      1.1929          .9542                  774
   30       .8197          .8768                 1221
   35       .4874          .7547                 1499
   40       .1880          .6048                 1533
   45      −.0862          .4515                 1376
   50      −.3429          .3139                 1119
   55      −.5904          .2020                  834
   60      −.8360          .1186                  566
   65     −1.0876          .0620                  342
   70     −1.3534          .0278                  176.7
   75     −1.6407          .01017                  73.5
   80     −1.9577          .00281                  22.5
   85     −2.3121          .00053                   4.7
   90     −2.7117          .00006                    .6


* Representing the proportions living above age t out of a total population of 1.
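
The working for Table XVIII may be sketched as below. This is an illustration only: it takes the four equations given above (five years of age as the unit, t = 0 at age 20), solves the last two for c and d, and then forms z, the proportion living above each age, and the numbers in each quinquennial group per 10,000. The proportion is computed as $\tfrac12(1+\operatorname{erf} z)$, and the function names are assumptions made for the example.

```python
from math import erf

# a and b follow at once from the first two equations.
a = 1.6147
b = 1.1929 - a

# Remaining equations: 10c + 10d = r1 and 36c + 84d = r2.
r1 = -0.0862 - a - 5 * b
r2 = -1.0877 - a - 9 * b
d = (r2 - 3.6 * r1) / 48        # eliminate c between the two equations
c = r1 / 10 - d


def z_at_age(age):
    """z with constant third differences; t counts five-year steps from age 20."""
    t = (age - 20) / 5
    return a + b * t + c * t * (t - 1) / 2 + d * t * (t - 1) * (t - 2) / 6


def proportion_above(age):
    """Proportion living above the given age out of a total population of 1."""
    return 0.5 * (1.0 + erf(z_at_age(age)))


def group_per_10000(age):
    """Numbers between age and age + 5 out of a total of 10,000."""
    return 10000 * (proportion_above(age) - proportion_above(age + 5))


for age in range(15, 95, 5):
    print(f"{age:3d}  {z_at_age(age):8.4f}  {proportion_above(age):7.4f}  {group_per_10000(age):7.1f}")
```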
B 
It will be seen that this distribution shows a small number
under age 15. This may be avoided if it is desired
to commence the curve at and not before that age, by
writing

$$z=\frac{m}{t-15}+a+bt+ct^2$$

giving

$$\frac{dz}{dt}=b+2ct-\frac{m}{(t-15)^2},$$

the term $\dfrac{m}{t-15}$ being introduced in order to give the high
values of z required near the origin, or, we may write as
suggested above, in connection with the O™ data


the value of a being taken in this case as equal to −15.

This form for the value of z will be found very convenient 
where the series is known to be limited in either direction and 
the number of groups is small. In certain cases either
a or b may be known, and we have, then, only four constants,
m, n, k, and b or a, to determine, for which four groups will
suffice. Or it may be convenient to assume values for both
a and b, in which case with four groups we may write


a, 0 Ges eee 


determining m, n, k and c from the data. 


In the case of any statistics intended to be used by the
actuary, it is important to consider not only how far they are
suitable for the purpose for which they are to be employed,
but also whether the data are sufficient to render the
conclusions drawn from them safe. We have already referred
to this question in general terms, but it is necessary to
consider it rather more closely.

In practice the actuary has to deal either, (1), with tables
based upon a large number of observations; for example,
tables such as the O™, the Government Annuitants, the
Manchester Unity Tables of Sickness, &c., where the
accidental errors due to the limited numbers are practically
insignificant, but where, on the other hand, there may be
uncertainty as to the suitability of the experience for the
case in hand; or, (2), with data of more limited extent but
known to be applicable, as in the valuation of a pension
fund or a Friendly Society by tables based upon its own
experience.

In the latter case it is important to be able to form some 
judgment as to the extent of the probable errors involved in 
the use of the data and their effect upon the financial values 
deduced therefrom. This is a problem not susceptible of an 
exact solution. It is true that if the series of numbers 
representing the deaths, marriages, or retirements, as the 
case may be, can be represented by a frequency curve, the 
probable error of the constants may be obtained in the manner 
shown by Professor Karl Pearson in his paper on this subject. 
But these results will be of little practical use to us, as
the manner in which these probable errors, which are not
independent, will affect the monetary values deduced from
the graduated rates is too complicated. We can only deal
with the problem in a very general manner. We are not
even sure that the ordinary theory of errors is applicable to
such functions as rates of mortality, sickness, or superannuation;
indeed, we may well suspect that it is not strictly
applicable.

If the probability of throwing a head at a single toss of a coin
is one-half, and if in 100 throws 54 heads appear to 46 tails, we
do not suppose that the probability of the average number of
50 heads appearing in the next 100 throws is affected. But
in the case of the probabilities of death it may well be that
an abnormally high or low rate of mortality in a given year 
may affect the probable rate in succeeding years, and that 
there may be a tendency for the deviations from the average 
result to correct themselves, a low rate in a given year 
leaving a larger number, and a high rate a smaller number, 
of impaired lives surviving, and thus changing for the time 
being the constitution of the group under observation. 

The “standard deviation” in the value of $a_x$ as deduced
from a given experience has not, that I am aware of, been
estimated. It will be instructive to attempt this, as an
example, for the O™® table. It will be sufficient to use
approximate methods, as the results will be quite accurate
enough for our purpose. We shall assume that we may take

$$\operatorname{colog}_e p = m = \frac{q}{1-\tfrac{1}{2}q}$$

and that if the standard deviation in $\log_e y = \sigma$, then the
standard deviation in $y = \sigma y$.*
Taking the observations at a given age x, let us put

  exposed to risk                          = n
  graduated or “true” rate of mortality    = q
  graduated deaths                         = nq = θ
  actual deaths                            = nq + z = θ′
  observed value of q                      = q′ = q + z/n,

where, as we have seen, the average value of z is zero, the
average value of z² = nq(1−q), &c. (see p. 110).
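Then the observed value of m = m′ where

$$\begin{aligned}
m' &= \frac{q'}{1-\tfrac12 q'} = \frac{q+\dfrac{z}{n}}{1-\tfrac12 q-\dfrac{z}{2n}}
    = \frac{q}{1-\tfrac12 q} + (\text{terms in powers of } z)\\
   &= m + f(z), \text{ say}\\
   &= \operatorname{colog}_e p + f(z)
\end{aligned}$$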

It will be found that the average value of f(z) is not quite
zero, though very nearly so, being equal to
$m\left(\dfrac{1}{2n}+\dots\right)$ nearly, a quantity that may be
neglected; and that the average value of $[f(z)]^2$ is
$\dfrac{m^2}{nq}$ very nearly, and

$$\sqrt{\text{average value of } [f(z)]^2} = \frac{m}{\sqrt{nq}}.$$

Hence, the standard deviation in the “central” death (or
marriage or secession or any similar) rate is very nearly equal
to the rate divided by the square root of the number of deaths
(marriages or secessions, &c.). The errors in $\log_e p$ are of
course the same, but of opposite sign to those in $\operatorname{colog}_e p$.


* If $\log_e y$ have the small error σ, y will be changed to
$e^{\log_e y+\sigma} = y e^{\sigma} = y(1+\sigma+\dots)$, i.e., the
corresponding error in y will be σy nearly.
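
The rule that the standard deviation of the central rate is nearly the rate divided by the square root of the number of deaths can be checked numerically. The sketch below is only an illustration: it assumes the deaths out of n exposed are distributed binomially with probability q, takes n = 1,000 and q = .02 as hypothetical figures, and computes the root-mean-square value of f(z) = m′ − m exactly by summation.

```python
from math import exp, lgamma, log, sqrt


def binomial_probability(k, n, q):
    """Probability of exactly k deaths among n lives, each with chance q."""
    log_prob = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                + k * log(q) + (n - k) * log(1 - q))
    return exp(log_prob)


def central_rate(q):
    """m = q / (1 - q/2), the assumed relation between q and the central rate."""
    return q / (1 - q / 2)


def rms_error_of_central_rate(n, q):
    """Square root of the average value of [f(z)]^2, summed over all outcomes."""
    m = central_rate(q)
    total = sum(binomial_probability(k, n, q) * (central_rate(k / n) - m) ** 2
                for k in range(n + 1))
    return sqrt(total)


n, q = 1000, 0.02                            # hypothetical exposure and rate
exact = rms_error_of_central_rate(n, q)
rule = central_rate(q) / sqrt(n * q)         # rate / sqrt(expected deaths)
print(round(exact, 6), round(rule, 6))
```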
Let the observed value of $\log_e p_x$ be $\log_e p'_x$. We will write

$$\log_e p'_x=\log_e p_x+u_x$$

where $u_x$ is the error of observation whose value in a particular
case is fixed but unknown, the average value over a long
series of similar observations being zero, and the average
value of $u_x^2$ being
$\dfrac{(\operatorname{colog}_e p_x)^2}{nq_x}=\dfrac{(m_x)^2}{nq_x}$;
where $nq_x$ is the graduated number of deaths at age x.

Taking an arbitrary radix for our mortality table, say $l_x$,
the values of $\log l'_{x+t}$ for ages above x will be:

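$$\begin{aligned}
\log_e l'_x &= \log_e l_x\\
\log_e l'_{x+1} &= \log_e l_{x+1}+u_x\\
\log_e l'_{x+t} &= \log_e l_{x+t}+(u_x+u_{x+1}+\dots+u_{x+t-1})
\end{aligned}$$

similarly, we shall have

$$\log_e D'_x=\log_e D_x$$

and for higher ages

$$\log_e D'_{x+t}=\log_e D_{x+t}+(u_x+u_{x+1}+\dots+u_{x+t-1})$$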
whence, on the principle of approximation laid down above,

$$\frac{D'_{x+t}}{D'_x}=\frac{D_{x+t}}{D_x}\,(1+u_x+u_{x+1}+\dots+u_{x+t-1})$$

Summing this for all values of t from 1 to infinity, we shall
have

$$a'_x=\frac{N'_x}{D'_x}=[N_x+(u_x\cdot N_x+u_{x+1}\cdot N_{x+1}+u_{x+2}\cdot N_{x+2}+\&c.)]\div D_x$$

Here the quantity in the bracket in the numerator is the
error in the value of $N'_x$ as deduced from the observations in
relation to the value of $D'_x$ corresponding to the arbitrary
radix assumed at that age. The average value of each term
in the bracket is zero, and the square root of the sum of the
average values of the squares of these terms divided by $D_x$
will give the standard deviation in the value of $a'_x$ as deduced
from the data, which, omitting the suffix x, becomes

$$\frac{1}{D_0}\sqrt{u_0^2N_0^2+u_1^2N_1^2+u_2^2N_2^2+\&c.}$$

If the mortality table be graduated the standard deviation of
the graduated values of $a'_x$ will be somewhat less than that
of the ungraduated values, but not materially less, except at
the ends of the table, the principal effect of the graduation
being merely to produce a smooth progression in values.

We might assume, for example, that the effect of graduation
was about equivalent to substituting the average error of five
successive values of $N'_x$ for the error of the middle value.
This would give (omitting a quite insignificant term) the
expression

$$\frac{1}{5D_x}\,[\,u_x(3N_x+N_{x+1}+N_{x+2})+u_{x+1}(4N_{x+1}+N_{x+2})+u_{x+2}\cdot 5N_{x+2}+u_{x+3}\cdot 5N_{x+3}+\&c.]$$

for the error in the graduated value of $a'_x$ in lieu of the
expression given above.

If we shorten the expression for the standard deviation
of $a'_x$ from

$$\frac{1}{D_x}\sqrt{u_x^2N_x^2+u_{x+1}^2N_{x+1}^2+u_{x+2}^2N_{x+2}^2+\&c.}$$

to its approximate equivalent

$$\frac{1}{D_x}\sqrt{5u_{x+2}^2N_{x+2}^2+5u_{x+7}^2N_{x+7}^2+5u_{x+12}^2N_{x+12}^2+\&c.},$$

and, further, take

$$5u_{x+2}^2=\frac{25\,[\operatorname{colog}_e(p_{x+2})]^2}{\text{Observed deaths between } x \text{ and } (x+5)}$$

we shall considerably shorten the labour of calculation, and
at the same time, by slightly underestimating the required
value, make a rough allowance for the effect of graduation.
We are now in a position to compute a table of standard
deviations for $a_x$ for quinquennial intervals of age, the
principal steps of the working being set out in the table
following. The final columns show the mean errors or
standard deviations in the value of $a_x$ and the corresponding
mean errors in $P_x$, found by dividing the former results by the
quantity $(1+a_x)^2$.*

* If $a_x$ have an error $\sigma_x$, then $P_x$ will have the error
$$\left(\frac{1}{1+a_x}-d\right)-\left(\frac{1}{1+a_x+\sigma_x}-d\right)=\frac{1}{1+a_x}-\frac{1}{1+a_x+\sigma_x}=\sigma_x\div(1+a_x)^2 \text{ nearly.}$$
Computation of the standard deviations σ_x in the deduced values
of a_x and σ_x ÷ (1 + a_x)² in the deduced values of P_x.


 Age   25[colog_e p_{x+2}]²   Deaths       5u²_{x+2} × 10⁴    5u²_{x+2}·N²_{x+2}   Sum of      σ_x =          σ_x ÷
  x        × 10²              between      = 10² (2) ÷ (3)    (scaled)             col. (5)    √col. (6)      (1+a_x)²
                              x and x+5                                            from last   ÷ D_x
 (1)        (2)                 (3)             (4)                (5)               (6)          (7)           (8)

  15        .1020                 10          1.020            20244.            21570.         .2200        .00038
  20        .1113                122           .0912            1163.             1326.         .0653        .00012
  25        .1266                924           .01370            110.0             162.5        .0274        .00006
  30        .1525              3,072           .004966            24.48             52.52       .0187        .00004
  35        .1981              5,689           .003482            10.20             28.04       .0165        .00004
  40        .2813              8,152           .003451             5.758            17.84       .0159        .00005
  45        .4410             10,257           .004295             3.864            12.08       .0160        .00006
  50        .7632             12,620           .006048             2.726             8.215      .0164        .00007
  55       1.444              14,903           .009694             1.986             5.489      .0169        .00010
  60       2.945              16,618           .01772              1.445             3.503      .0177        .00014
  65       6.359              17,455           .03644               .9770            2.059      .0187        .00021
  70      14.32               16,042           .08929               .6052            1.082      .0203        .00033
  75      33.20               12,172           .2728                .3185             .4764     .0228        .00059
  80      78.51                7,317          1.073                 .1227             .1580     .0272        .00116
  85     188.1                 2,865          6.566                 .03151            .03528    .0364        .00267
  90     454.6                   692         65.71                  .003659           .003775   .0550        .00705
  95    1105.                     86       1285.                    .000118           .000118   .0966        .02146
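
Two of the steps in the table are purely mechanical and can be checked directly: column (4) is column (2) divided by column (3) and multiplied by 100, and column (6) is the running total of column (5) taken from the oldest age upwards. The sketch below does only this; the figures are those of columns (2), (3) and (5) as read from the table, and the variable names are illustrative. Columns (7) and (8) require $D_x$ and $a_x$, which are not reproduced here.

```python
# Columns (2), (3) and (5) of the table, ages 15, 20, ..., 95.
AGES = list(range(15, 100, 5))
COL2 = [0.1020, 0.1113, 0.1266, 0.1525, 0.1981, 0.2813, 0.4410, 0.7632,
        1.444, 2.945, 6.359, 14.32, 33.20, 78.51, 188.1, 454.6, 1105.0]
DEATHS = [10, 122, 924, 3072, 5689, 8152, 10257, 12620,
          14903, 16618, 17455, 16042, 12172, 7317, 2865, 692, 86]
COL5 = [20244.0, 1163.0, 110.0, 24.48, 10.20, 5.758, 3.864, 2.726,
        1.986, 1.445, 0.9770, 0.6052, 0.3185, 0.1227, 0.03151, 0.003659, 0.000118]

# Column (4): 100 * (2) / (3).
col4 = [100 * c2 / d for c2, d in zip(COL2, DEATHS)]

# Column (6): sum of column (5) from the last age upwards.
col6 = []
running = 0.0
for value in reversed(COL5):
    running += value
    col6.append(running)
col6.reverse()

for age, c4, c6 in zip(AGES, col4, col6):
    print(f"{age:3d}  {c4:12.6f}  {c6:12.6f}")
```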


The result we have arrived at shows that the mean error,
or standard deviation, in the values of the 3 per-cent
Annuities in an aggregate experience such as the O™™ is
about one-fiftieth of a year’s purchase from about 30 to 65
years of age. Owing to the greater number of deaths at the
younger ages in the O™ experience this would about represent
standard deviations for that Table from 25 to 65.

If we suppose an experience in which the data were
one-hundredth of the extent of the O™ but similarly
distributed, it is obvious, from a consideration of the process
by which the above result was obtained, that the standard
deviations or mean errors in the annuity-values would be
ten times greater than the values found above. Hence, with
an experience including about 1,000 deaths distributed
approximately as in the O™) data the deduced annuity-values
between ages 30 and 60 would on the average be uncertain
to about ±.20, or from 1 per-cent to 1½ per-cent of their
values. The standard deviations above obtained would be
somewhat reduced in a small experience by graduating the
experience by Makeham or by a suitable frequency curve,
but not very materially. It would occupy too much time
to investigate this point, but we may easily find a limit to the
effect of any possible method of graduation in reducing
the standard deviations of the annuities. In any ordinary
experience such as the O™, where the observed deaths are a
small fraction of the lives passing under observation, the
errors in the annuity-values will be due, 1°, to the mortality
on the whole being above or below normal, 2°, to the
distribution of the mortality being abnormal. This latter
factor can alone be affected by any method of graduation.
Assume it to disappear altogether, and consider the standard
deviation for, say, $a_{50}$ (O™* 3 per-cent) obtained on this
hypothesis. There were approximately 100,000 deaths
observed above age 50 in this experience. We have
$\sqrt{100{,}000}=316$ nearly, and if we assume the mortality
above 50 to be throughout subject to an error of $\pm\frac{1}{316}$ of
the observed amount, this will be equivalent to changes
of $\pm\frac{A}{316}$ and $\pm\frac{B}{316}$ in the values of the constants A and B
respectively, which, taking the value of A = .00589 and
log c = .039, are equivalent in their effect upon the annuity-
value to a change of .00186 in the rate of interest per-cent
and of .0341 years in the age. The combined effect of these
changes upon the annuity-value at age 50 is equivalent to
±.0148 as compared with the standard deviation of .0185
obtained above. The very considerable standard deviations
at the ends of the table would, however, be reduced in much
greater proportion.

The problem dealt with above is not the same as that of 
determining the standard deviation in the estimated value of 
an annuity on a single life. This problem, which is also of 
importance, has been dealt with by Dr. Bremiker in his paper 
“On the Risk Attaching to the grant of Life Assurances ” 
(JIA. xvi, pp. 216, 285). As this paper is not very available 
for students and the notation is not modern, it may be worth 
while to give the following short demonstration. For the 
sake of simplicity “continuous” functions are used. 

If the annuitant, aged x at entry, die at the end of the
time t the loss to the company granting the annuity, or the
deviation from its mean value, referred to the date of entry
will be

$$\bar a_{\overline{t}|}-\bar a_x=\frac{\bar A_x-e^{-t\delta}}{\delta},$$
and the sum of the squares of all values of this quantity,
multiplied by the frequency in each case, will be

$$\sigma^2=-\int_0^\infty\left(\frac{\bar A_x-e^{-t\delta}}{\delta}\right)^2\frac{d}{dt}({}_tp_x)\,dt
=-\int_0^\infty\frac{(\bar A_x)^2-2e^{-t\delta}\bar A_x+e^{-2t\delta}}{\delta^2}\,\frac{d}{dt}({}_tp_x)\,dt.$$

Noting that

$$-\int_0^\infty\frac{d}{dt}({}_tp_x)\,dt=1$$

$$-\int_0^\infty e^{-t\delta}\,\frac{d}{dt}({}_tp_x)\,dt=\bar A_x$$

and

$$-\int_0^\infty e^{-2t\delta}\,\frac{d}{dt}({}_tp_x)\,dt=\bar A'_x \quad(\text{at rate of interest } = e^{2\delta}-1)$$

we obtain from the above, as the value of the standard
deviation of $\bar a_x$, and therefore with sufficient accuracy for
practical purposes of $a_x$ ($=\bar a_x-\tfrac12$ nearly) the expression

$$\sigma=\frac{1}{\delta}\left[\bar A'_x-(\bar A_x)^2\right]^{\frac12}$$

the first term in the bracket being computed at the rate of
interest $e^{2\delta}-1$, and the second at the rate $e^{\delta}-1$. It is obvious
that the standard deviation for $\bar A_x$ will be the above expression
multiplied by δ; and for $\bar A_x$ less the capitalised value of the
annual premiums ($\bar P_x$) (which Dr. Bremiker terms the “Risk
attaching to the grant of Life Assurances” by annual
premiums) the risk will be the above expression multiplied by
($\bar P_x+\delta$). The premium is here supposed to be payable
continuously; if an ordinary annual premium is in question,
we should multiply the above expression for σ by ($P_x+d$).
The arithmetical values of these “risks” attaching to grant
of assurances or annuities computed at 4 per-cent, according
to Heym’s mortality table (General Widows Fund of Berlin)
are given in the paper referred to, and show, as is obviously
the case from general considerations, that the “risk”, or
average fluctuation whether profit or loss, attaching to the
grant of assurances at annual premiums is considerably
greater than that attaching to their grant at single premiums.
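
The expression for σ may be verified numerically. The sketch below is an illustration under assumptions that are not in the text: the time to death is discretised into short steps, the survival function is taken as a constant force of mortality of .04 (chosen only because it gives a closed form to compare against), and δ = .03. It computes the standard deviation of the annuity both from the formula $\frac{1}{\delta}[\bar A'_x-(\bar A_x)^2]^{\frac12}$ and directly from the distribution of the present value.

```python
from math import exp, sqrt


def death_distribution(survival, horizon=300.0, steps=30000):
    """Discretise the time of death into (time, probability) pairs."""
    h = horizon / steps
    dist = []
    for i in range(steps):
        t0, t1 = i * h, (i + 1) * h
        dist.append(((t0 + t1) / 2, survival(t0) - survival(t1)))
    dist.append((horizon, survival(horizon)))   # residual mass, so the total is 1
    return dist


def assurance(dist, delta):
    """Value of 1 payable at death, at force of interest delta."""
    return sum(prob * exp(-delta * t) for t, prob in dist)


def annuity_sd_by_formula(dist, delta):
    """(1/delta) * sqrt(A' - A^2), with A' computed at force 2*delta."""
    a1 = assurance(dist, delta)
    a2 = assurance(dist, 2 * delta)
    return sqrt(a2 - a1 * a1) / delta


def annuity_sd_direct(dist, delta):
    """Standard deviation of the annuity-certain for the time lived."""
    values = [((1 - exp(-delta * t)) / delta, prob) for t, prob in dist]
    mean = sum(v * prob for v, prob in values)
    return sqrt(sum(prob * (v - mean) ** 2 for v, prob in values))


mu, delta = 0.04, 0.03                          # hypothetical forces of mortality and interest
dist = death_distribution(lambda t: exp(-mu * t))
print(round(annuity_sd_by_formula(dist, delta), 4), round(annuity_sd_direct(dist, delta), 4))
```

The same discrete distribution could be built from any survival function; multiplying the result by δ, or by the premium plus δ, gives the corresponding figures for the assurance at single and at continuous premiums.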
In practice the important question for a life office, in this
connection (and the same considerations apply to other
classes of insurance) is the average amount of the annual (or
quinquennial) fluctuation in profit due to the deviation of the
death strain from its average or normal amount. In a
soundly managed office these fluctuations never approach the
point at which stability is remotely threatened, but they
are of importance when they are sufficient to produce any
serious variation in the rate of Bonus.
The mean square deviation of $\bar e_x$ will be found by putting
δ=0 in the expression for σ², which in that case takes the
indeterminate form $\frac{0}{0}$ which must be evaluated, according to
the rules of the Differential Calculus, by differentiating
numerator and denominator. The resulting expression takes
the same form, so that the process must be repeated, and the
limiting value of the expression for σ² when δ=0 will be
found to be

$$\lim_{\delta=0}\ \frac{1}{2}\,\frac{d^2}{d\delta^2}\left[\bar A'_x-(\bar A_x)^2\right]$$

which may easily be reduced to the form

$$\begin{aligned}
&-\int_0^\infty t^2\,\frac{d}{dt}({}_tp_x)\,dt-\left(-\int_0^\infty t\,\frac{d}{dt}({}_tp_x)\,dt\right)^2\\
&=-\int_0^\infty t^2\,\frac{d}{dt}({}_tp_x)\,dt-(\bar e_x)^2\\
&=[\text{mean square duration}-(\text{mean duration})^2]
\end{aligned}$$

This being the mean square deviation the standard deviation
will be

$$\sigma=[\text{mean square duration}-(\text{mean duration})^2]^{\frac12}$$


the mean deviation irrespective of sign is approximately
.798σ and the probable deviation .674σ, or very nearly
$\frac{4}{5}\sigma$ and $\frac{2}{3}\sigma$ respectively.* [Cf. De Morgan, Encycl.
Metropolitan, Vol. II, p. 460, Art. 149]. If instead of a single risk
the average of n risks be taken, all the above quantities will
be divided by $\sqrt{n}$.


*The exact values for the mean d 
expectation of life and of the annuity wil 
Where ¢ is in the first instance equal to 
continuous annuity certain Qi=az. 


NOTE A. 


ON THE EVALUATION OF THE SUCCESSIVE MOMENTS OF THE
BINOMIAL EXPANSION OF $(p+q)^n$.


THESE important moments may be found very simply in the
following manner. The expanded series being

$$\begin{aligned}
&p^n+np^{n-1}q+\frac{n(n-1)}{2}p^{n-2}q^2+\dots+npq^{n-1}+q^n\\
&=u_0+u_1+u_2+\dots+u_{n-1}+u_n\\
&=\Sigma u_x, \text{ where the subscript is identical with the exponent of } q,
\end{aligned}$$

the successive moments round the origin will be $\Sigma u_x$, $\Sigma xu_x$, $\Sigma x^2u_x$,
&c. We will first find the value of $\Sigma u_x$, $\Sigma xu_x$, $\Sigma x(x-1)u_x$, &c.
We have

$$\Sigma u_x=(p+q)^n=1$$

$$\begin{aligned}
\Sigma xu_x&=0\times p^n+1\times np^{n-1}q+2\times\frac{n(n-1)}{2}p^{n-2}q^2+\dots+nq^n\\
&=nq\left[p^{n-1}+(n-1)p^{n-2}q+\frac{(n-1)(n-2)}{2}p^{n-3}q^2+\dots+q^{n-1}\right]\\
&=nq[p+q]^{n-1}=nq.
\end{aligned}$$

Similarly

$$\begin{aligned}
\Sigma x(x-1)u_x&=1\times2\times\frac{n(n-1)}{2}p^{n-2}q^2+2\times3\times\frac{n(n-1)(n-2)}{6}p^{n-3}q^3+\dots+n(n-1)q^n\\
&=n(n-1)q^2\left[p^{n-2}+(n-2)p^{n-3}q+\dots+q^{n-2}\right]\\
&=n(n-1)q^2[p+q]^{n-2}=n(n-1)q^2
\end{aligned}$$

and similarly we shall find

$$\Sigma x(x-1)(x-2)u_x=n(n-1)(n-2)q^3, \text{ and so on.}$$


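Hence

$$\begin{aligned}
\Sigma u_x&=1\\
\Sigma^2u_x&=\Sigma xu_x=nq\\
\Sigma^3u_x&=\Sigma\frac{x(x-1)}{2}u_x=\frac{n(n-1)}{2}q^2\\
\Sigma^4u_x&=\Sigma\frac{x(x-1)(x-2)}{6}u_x=\frac{n(n-1)(n-2)}{6}q^3\\
\Sigma^5u_x&=\Sigma\frac{x(x-1)(x-2)(x-3)}{24}u_x=\frac{n(n-1)(n-2)(n-3)}{24}q^4
\end{aligned}$$

by the formulæ on page 59. Hence we have (see the demonstration
in Note E, page 124), using $m_n$ for the nth moment round the origin

$$\begin{aligned}
m_0&=\Sigma u_x=1\\
m_1&=\Sigma^2u_x=nq\\
m_2&=2\Sigma^3u_x+\Sigma^2u_x=n(n-1)q^2+nq\\
m_3&=6\Sigma^4u_x+6\Sigma^3u_x+\Sigma^2u_x=n(n-1)(n-2)q^3+3n(n-1)q^2+nq\\
m_4&=24\Sigma^5u_x+36\Sigma^4u_x+14\Sigma^3u_x+\Sigma^2u_x\\
&=n(n-1)(n-2)(n-3)q^4+6n(n-1)(n-2)q^3+7n(n-1)q^2+nq
\end{aligned}\qquad(A)$$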

These last equations may be found directly, by means of successive
differentiation, according to a method suggested by Bertrand
(Calcul des Probabilités, Chap. IV, Art. 62). We have

$$(p+q)^n=p^n+np^{n-1}q+\frac{n(n-1)}{2}p^{n-2}q^2+\dots+npq^{n-1}+q^n$$

$$\frac{d}{dq}(p+q)^n=\left[0\times p^n+1\times np^{n-1}+2\times\frac{n(n-1)}{2}p^{n-2}q+\dots+(n-1)npq^{n-2}+nq^{n-1}\right]$$

and

$$q\cdot\frac{d}{dq}(p+q)^n=\left[1\times np^{n-1}q+2\times\frac{n(n-1)}{2}p^{n-2}q^2+\dots+nq^n\right]$$

= 1st moment.
Similarly, if we differentiate the last series with respect to q
and multiply the result by q (to restore the power which
is lost in the differentiation) we shall have

$$q\cdot\frac{d}{dq}[\text{1st moment}]=\left[1^2\times np^{n-1}q+2^2\times\frac{n(n-1)}{2}p^{n-2}q^2+\dots+n^2q^n\right]$$

= 2nd moment; and so on, so that

$$[t\text{th moment}]=q\cdot\frac{d}{dq}[(t-1)\text{th moment}].$$

Thus the first moment

$$=nq(p+q)^{n-1}$$

Second moment

$$=q\frac{d}{dq}\left[nq(p+q)^{n-1}\right]=n(n-1)q^2(p+q)^{n-2}+nq(p+q)^{n-1}$$

Third moment

$$\begin{aligned}
&=q\frac{d}{dq}\left[n(n-1)q^2(p+q)^{n-2}+nq(p+q)^{n-1}\right]\\
&=n(n-1)(n-2)q^3(p+q)^{n-3}+2n(n-1)q^2(p+q)^{n-2}\\
&\qquad+n(n-1)q^2(p+q)^{n-2}+nq(p+q)^{n-1}\\
&=n(n-1)(n-2)q^3(p+q)^{n-3}+3n(n-1)q^2(p+q)^{n-2}+nq(p+q)^{n-1}
\end{aligned}$$

Fourth moment

$$\begin{aligned}
&=q\frac{d}{dq}[\text{third moment}]\\
&=n(n-1)(n-2)(n-3)q^4(p+q)^{n-4}+3n(n-1)(n-2)q^3(p+q)^{n-3}\\
&\qquad+3n(n-1)(n-2)q^3(p+q)^{n-3}+6n(n-1)q^2(p+q)^{n-2}\\
&\qquad+n(n-1)q^2(p+q)^{n-2}+nq(p+q)^{n-1}\\
&=n(n-1)(n-2)(n-3)q^4(p+q)^{n-4}+6n(n-1)(n-2)q^3(p+q)^{n-3}\\
&\qquad+7n(n-1)q^2(p+q)^{n-2}+nq(p+q)^{n-1}
\end{aligned}$$

Putting unity* for all the powers of (p+q), these expressions
are the same as previously found (see equations A).

* This may not be done at any earlier stage because the differentiations are
with respect to q, taking p constant, whereas to substitute p+q=1 before
finishing the differentiations would make p vary with q.
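
The rule that each moment is q·d/dq of the one before can be imitated mechanically. The following sketch is an illustration, not Bertrand's own presentation: it stores the expansion of $(p+q)^n$ as a table of coefficients indexed by the powers of p and q, applies the operator q·d/dq term by term with p held constant, and only at the end substitutes numerical values with p + q = 1, as the footnote requires. The names and the values n = 5, q = 1/4 are assumptions made for the example.

```python
from fractions import Fraction
from math import comb


def expand_binomial(n):
    """(p + q)^n stored as {(power of p, power of q): coefficient}."""
    return {(n - x, x): comb(n, x) for x in range(n + 1)}


def q_times_d_dq(poly):
    """Apply q * d/dq term by term, treating p as a constant."""
    # d/dq lowers the power of q by one and multiplies by the old power;
    # the factor q then restores the power, so only the coefficient changes.
    return {(i, j): coeff * j for (i, j), coeff in poly.items() if j > 0}


def evaluate(poly, p, q):
    return sum(coeff * p ** i * q ** j for (i, j), coeff in poly.items())


def moment_by_differentiation(n, q, t):
    """t-th moment round the origin, substituting p = 1 - q only at the end."""
    poly = expand_binomial(n)
    for _ in range(t):
        poly = q_times_d_dq(poly)
    return evaluate(poly, 1 - q, q)


n, q = 5, Fraction(1, 4)
print([moment_by_differentiation(n, q, t) for t in range(5)])
```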


We have thus obtained the moments round the origin. Thence
the moments round the mean may be found by the formulæ on
p. 41. Thus

$$\begin{aligned}
\mu_1&=0\\
\mu_2&=m_2-(m_1)^2=n(n-1)q^2+nq-n^2q^2=nq-nq^2\\
&=nq(1-q)=npq\\
\mu_3&=m_3-3m_1\mu_2-m_1^3\\
&=n(n-1)(n-2)q^3+3n(n-1)q^2+nq-3n^2q^2+3n^2q^3-n^3q^3\\
&=nq-3nq^2+2nq^3=nq(1-3q+2q^2)\\
&=nq(1-q)(1-2q)=npq(p-q)\\
\mu_4&=m_4-4m_1\mu_3-6m_1^2\mu_2-m_1^4\\
&=nq[(n^3-6n^2+11n-6)q^3+6(n^2-3n+2)q^2+7(n-1)q+1\\
&\qquad-4nq(1-3q+2q^2)-6n^2q^2(1-q)-n^3q^3]\\
&=nq[3(n-2)q^3-6(n-2)q^2+(3n-7)q+1]
\end{aligned}$$

which reduces to

$$nq(1-q)[3(n-2)(1-q)q+1]=npq[3(n-2)pq+1]$$

It is evident that all the even moments must involve p and q 
symmetrically ; while the odd moments will involve a symmetrical 
function of p and gq, together with the factor (p— gq), because they 
must vanish when p=q (i.e, when the curve is symmetrical) and 
must only change sign when p and q are transposed. 
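
The three results npq, npq(p − q) and npq[3(n − 2)pq + 1] can be confirmed by summing the powers of the deviations directly. The sketch below uses exact fractions; the values n = 9 and q = 1/3 are arbitrary, and x is taken, as in the Note, to be the exponent of q.

```python
from fractions import Fraction
from math import comb


def central_moment_by_summation(n, q, r):
    """Sum of (x - nq)^r * u_x over the terms of (p + q)^n."""
    p = 1 - q
    mean = n * q
    return sum((x - mean) ** r * comb(n, x) * p ** (n - x) * q ** x
               for x in range(n + 1))


def central_moment_by_formula(n, q, r):
    """Second, third and fourth moments round the mean as given in the text."""
    p = 1 - q
    formulas = {
        2: n * p * q,
        3: n * p * q * (p - q),
        4: n * p * q * (3 * (n - 2) * p * q + 1),
    }
    return formulas[r]


n, q = 9, Fraction(1, 3)
for r in (2, 3, 4):
    print(r, central_moment_by_summation(n, q, r), central_moment_by_formula(n, q, r))
```

Interchanging p and q leaves the second and fourth results unaltered and reverses the sign of the third, in agreement with the remark on symmetry.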




It may be convenient to repeat here the Author’s demonstration 
given, J.I.A., xxvii, 214, of the value of the average deviation from 
the mean irrespective of sign, that is, treating all the deviations as 
positive. 

If we suppose the event to happen m times in the n trials the
deviation from the mean number np will be (m − np) which, since
p+q is always equal to 1, may be put in the form [mq − (n − m)p].
This will be positive or negative as m is > or < np; and the
probability of this particular deviation will be

$$\frac{n(n-1)\dots(m+1)}{(n-m)!}\,p^mq^{n-m}.$$

The greatest positive deviation will be nq (when the event
happens at all the trials); the greatest negative deviation −np
(when it fails at every trial).


Hence, we have the following scheme, in which m is to be taken
as the next integer < np.


Possible Deviations From Mean Result np.

$$\begin{array}{l|l|l}
\text{Magnitude} & \text{Probability} & \text{Magnitude}\times\text{Probability}\\
\hline
nq & p^n & np^nq\\[4pt]
(n-1)q-p & np^{n-1}q & n(n-1)p^{n-1}q^2-np^nq\\[4pt]
(n-2)q-2p & \dfrac{n(n-1)}{2!}p^{n-2}q^2 & \dfrac{n(n-1)(n-2)}{2!}p^{n-2}q^3-n(n-1)p^{n-1}q^2\\[4pt]
\quad\vdots & \quad\vdots & \quad\vdots\\[4pt]
(m+1)q-(n-m-1)p & \dfrac{n\dots(m+2)}{(n-m-1)!}p^{m+1}q^{n-m-1} & \dfrac{n\dots(m+1)}{(n-m-1)!}p^{m+1}q^{n-m}-\dfrac{n\dots(m+2)}{(n-m-2)!}p^{m+2}q^{n-m-1}\\[4pt]
\hline
mq-(n-m)p & \dfrac{n\dots(m+1)}{(n-m)!}p^mq^{n-m} & \dfrac{n\dots m}{(n-m)!}p^mq^{n-m+1}-\dfrac{n\dots(m+1)}{(n-m-1)!}p^{m+1}q^{n-m}\\[4pt]
\quad\vdots & \quad\vdots & \quad\vdots\\[4pt]
q-(n-1)p & npq^{n-1} & npq^n-n(n-1)p^2q^{n-1}\\[4pt]
-np & q^n & -npq^n
\end{array}$$

If the final column of products is examined it will be seen that
each positive term is cancelled by a similar negative term in
the succeeding product. Hence, the total of the products, that
is to say, the average deviation, is zero, showing that np is the true
mean result, the positive and negative deviations from which
exactly balance each other. Of the terms above the horizontal
line, representing the positive deviations, the sum is, of course,
equal to the only uncancelled term,
$\dfrac{n\dots(m+1)}{(n-m-1)!}\,p^{m+1}q^{n-m}$,
and similarly of the terms below the line representing the
negative deviations the sum is
$-\dfrac{n\dots(m+1)}{(n-m-1)!}\,p^{m+1}q^{n-m}$.

Hence, the average magnitude of the deviations, that is, the total of
every possible deviation multiplied by its probability, regardless of
sign, is

$$2\,\frac{n\dots(m+1)}{(n-m-1)!}\,p^{m+1}q^{n-m}=\frac{2\,n!}{m!\,(n-m-1)!}\,p^{m+1}q^{n-m}\qquad(a)$$

which, since the sum of all the probabilities is necessarily 1, will
also be the average or mean deviation. This result is exact, not
approximate, but where n and m are large numbers it is necessary to
simplify it by the use of Stirling’s formula, which gives for large
numbers $n!=\sqrt{2\pi}\,n^{n+\frac12}e^{-n}$ nearly.
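Put (a) into the equivalent form

$$\frac{2\,n!\,(n-m)}{m!\,(n-m)!}\,p^{m+1}q^{n-m};$$

using Stirling’s approximation to the factorials, we have

$$\frac{2}{\sqrt{2\pi}}\,n^{n+\frac12}\,m^{-(m+\frac12)}\,(n-m)^{-(n-m+\frac12)}\,(n-m)\,p^{m+1}q^{n-m}.$$

Since m is the integer immediately below np, we may write
m = np − k; n − m = nq + k (where k is a fraction); hence, we get

$$\begin{aligned}
&\frac{2}{\sqrt{2\pi}}\,n^{n+\frac12}\left[np\left(1-\frac{k}{np}\right)\right]^{-np-\frac12+k}\left[nq\left(1+\frac{k}{nq}\right)\right]^{-nq+\frac12-k}p^{np+1-k}q^{nq+k}\\
&=\sqrt{\frac{2}{\pi}}\,\sqrt{npq}\,\left(1-\frac{k}{np}\right)^{-np}\left(1+\frac{k}{nq}\right)^{-nq}\left[\left(1-\frac{k}{np}\right)^{k-\frac12}\left(1+\frac{k}{nq}\right)^{\frac12-k}\right]
\end{aligned}$$

but where np and nq are large numbers, k being a proper fraction,
the last factor is very nearly equal to 1, and
$\left(1-\frac{k}{np}\right)^{-np}$ and $\left(1+\frac{k}{nq}\right)^{-nq}$
are very nearly equal to $e^{k}$ and $e^{-k}$ respectively; hence,
the above expression reduces to

$$\sqrt{\frac{2}{\pi}}\,\sqrt{npq}=.7979\sqrt{npq}=\frac{4}{5}\sqrt{npq}\ \text{very nearly.}$$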
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Although this result has been obtained on the assumption that 
np and mq are large, it will be found to be very approximate 
even for small numbers. As an extreme case, suppose 120 lives at 
risk, the probability of death in each case being -:02; the 
“expected” deaths would then be 2-4, and the extent by which 
the actual deaths would, on the average, exceed or fall short of this 
number would be given by the formula as 


4 See 
: »/9-4 x -98 = 1-227. 


The true value of the average deviation given by formula (a) is 


» 120.119.118 (-02)9(-98)"8 


wae 1-2 
= 1-243, 


almost identical with the approximate result above. 
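
The comparison just made can be checked numerically. The following is a minimal sketch, not part of the original text; the function names are illustrative only, and m is taken as the integer part of np. It evaluates formula (a), the same average obtained by direct summation over every possible number of deaths, and the approximation $\sqrt{2/\pi}\,\sqrt{npq}$.

```python
from math import comb, floor, pi, sqrt


def exact_mean_deviation(n, p):
    """Formula (a): 2 * n! / (m! (n-m-1)!) * p^(m+1) * q^(n-m), m = integer part of np."""
    q = 1.0 - p
    m = floor(n * p)
    # n! / (m! (n-m-1)!) is the same quantity as (m+1) * C(n, m+1)
    return 2 * (m + 1) * comb(n, m + 1) * p ** (m + 1) * q ** (n - m)


def direct_mean_deviation(n, p):
    """Sum of |r - np| multiplied by its binomial probability, over every r."""
    q = 1.0 - p
    return sum(abs(r - n * p) * comb(n, r) * p ** r * q ** (n - r)
               for r in range(n + 1))


def approximate_mean_deviation(n, p):
    """Stirling-based approximation sqrt(2/pi) * sqrt(npq)."""
    return sqrt(2.0 / pi) * sqrt(n * p * (1.0 - p))


if __name__ == "__main__":
    n, p = 120, 0.02
    print(exact_mean_deviation(n, p))
    print(direct_mean_deviation(n, p))
    print(approximate_mean_deviation(n, p))
```

With the factor taken as 4/5 instead of $\sqrt{2/\pi}$ the approximate figure rises slightly, to the 1.227 quoted in the text.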
NOTE B.


ON THE USE OF LOGARITHMS OF THE UNADJUSTED TERMS
OF A SERIES.


Consider the number of cases out of a given series falling into
a particular group; or the number of deaths, or analogous events,
at a given age or group of ages, accruing out of a given number at
risk. Suppose the series to consist of n cases in all, and let the
true probability of any case falling into the particular group be p,
and let m = np. Let the observed number of cases in the group be
m' = m + z, where, as we have seen, z has an average value of zero,
z² has an average value of

$$np(1-p)=\frac{m(n-m)}{n},$$

z³ has an average value of

$$np(1-p)(1-2p)=\frac{m(n-m)(n-2m)}{n^{2}}.$$


If we operate with the logs of the observed quantities m', we
must avoid by arbitrary grouping cases in which m' is zero, or m':n
very small when the logs become infinite or very great; but when
this is done we shall still find the logs of the ungraduated numbers
less on the average than the values of the graduated (or true)
numbers. This may be easily seen from a simple example. Let
n = 4, and p = ½, in which case m = np = 2. The observed values of
m' may be anything from 0 to 4, and we shall have the following
possible cases:

    Values of       Relative frequency      log m'         Products
    m' = m + z      of these values                        (2) x (3)
       (1)                (2)                 (3)             (4)

        0 }              1/16 }
        1 }              4/16 }              -.097        (say) -.030
        2                6/16                 .301             .113
        3                4/16                 .477             .119
        4                1/16                 .602             .038
                                                              ------
                                                               .240

Here, to avoid the cases in which the observed value of m' is
zero, we have combined the first two groups, taking four cases in
which m' = 1, and one case in which m' = 0, thus giving an average
value of $\frac{4\times1+1\times0}{5}=\frac45$, the logarithm of which is -.097. Notwithstanding
this device our average value of log m' is only .240 as
compared with the value of log m = .301 (where m = 2 is the true
value or average value of m').

Assume that on the average

$$\log\,[m'(1+k)]=\log m,$$

that is,

$$\log m=\log\,[(m+z)(1+k)]=\log m+\frac{z}{m}-\frac{z^{2}}{2m^{2}}+\frac{z^{3}}{3m^{3}}-\&c.,+k-\frac{k^{2}}{2}+\&c.$$

Whence

$$k-\frac{k^{2}}{2}+\&c.=-\frac{z}{m}+\frac{z^{2}}{2m^{2}}-\frac{z^{3}}{3m^{3}},\ \&c.$$

Insert the average values as given above for z, z², &c.,

$$k-\frac{k^{2}}{2}+\&c.=\frac{n-m}{2nm}-\frac{(n-m)(n-2m)}{3n^{2}m^{2}}+\&c.$$

or, omitting terms of the second order,

$$k=\frac{n-m}{2nm}\ \text{nearly,}$$

which, again omitting terms of the second order, may be written
$\frac{n-m'}{2nm'}$, so that

$$\log\,[m'(1+k)]=\log\Bigl[m'+\frac{n-m'}{2n}\Bigr]=\log\Bigl[m'+\frac{1-p'}{2}\Bigr]$$

where p' is the observed value of the probability p.

If this expression be substituted for log m' in the example
given above, we should have as the sum of the products of
col. (2) x col. (3) the value .309, which is very much nearer the
true value .301 than the uncorrected value in the above table. If
we take larger numbers, as n = 100, m = np = 10, we shall find by a
similar process the average value of $\log_{10}\bigl(m'+\frac{1-p'}{2}\bigr)$ is .99987 as
compared with the true value of log m = 1.00000. Where the
numbers n and m' are very large, the correction, of course, becomes
insignificant.
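
The averages quoted in this Note may be reproduced by direct summation over the binomial distribution of m'. The sketch below is illustrative only and not from the original; the function names are assumptions, common logarithms are used as in the example, and the corrected quantity is taken to be $\log_{10}\bigl(m'+\frac{1-p'}{2}\bigr)$ with p' = m'/n.

```python
from math import comb, log10


def binomial_probabilities(n, p):
    """Probability of each possible observed number m' = 0, 1, ..., n."""
    return [comb(n, r) * p ** r * (1 - p) ** (n - r) for r in range(n + 1)]


def mean_log_grouped(n, p):
    """Average of log10 m', the cases m' = 0 and m' = 1 being merged into one group."""
    probs = binomial_probabilities(n, p)
    low = probs[0] + probs[1]
    low_mean = probs[1] / low  # average value of m' within the merged group
    total = low * log10(low_mean)
    total += sum(probs[r] * log10(r) for r in range(2, n + 1))
    return total


def mean_log_corrected(n, p):
    """Average of log10(m' + (1 - p')/2), where p' = m'/n is the observed rate."""
    probs = binomial_probabilities(n, p)
    return sum(pr * log10(r + (1 - r / n) / 2) for r, pr in enumerate(probs))


if __name__ == "__main__":
    print(mean_log_grouped(4, 0.5))      # uncorrected average, n = 4, p = 1/2
    print(mean_log_corrected(4, 0.5))    # corrected average, same case
    print(mean_log_corrected(100, 0.1))  # corrected average, n = 100, m = 10
```

No grouping is needed in the corrected form, since the quantity under the logarithm is never zero.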
It may be shown in like manner that, if we are dealing with the
reciprocals of the observed values, then, on the average,

$$\frac{1}{m'+1-p'}=\frac{1}{m}\ \text{nearly,}$$

and again, on the average,

$$\frac{1}{p'+\dfrac{1-p'}{n}}=\frac{1}{p}.$$

Reverting to the question of the use of the logs of the 
ungraduated quantities, it will be found that if the above results 
are made use of in practice, the logarithms will be over-corrected. 
The reason for this is that we do not eventually arrive at the true 
values of log m and log p, the graduated values being still affected 
by an outstanding or unbalanced error. If our series consists of a 
large number of groups, these outstanding errors will be comparatively
small, and the above correction will not be much in excess;
but if the number of groups is very small, our graduated quantities 
must necessarily follow rather closely the original values, and the 
use of the above formula would largely over-correct the series. 
Suppose, for example, we had a series of ten groups. We should 
require about five groups to obtain the general form of the curve, 
or to determine the constants of any frequency curve employed, 
hence the errors of the groups would only be reduced by the ratio: 


of approximately ye and the correction k as shown above should 


be reduced by half, and proportionately in other cases.
NOTE C.


ON THE RATIONALE OF THE METHOD OF LEAST SQUARES.


In statistical work it often happens that a number of constants,
entering into the known mathematical form of a given function,
have to be evaluated from a much greater number of observed
values of the function. We may, for example, have three constants,
such as x, y, z, in the expression lx + my + nz = F, and fifty observed
values of F (embodying different values of the coefficients
l, m, n) from which to determine the constants. If the observed
values of F were rigidly accurate, any three of them, or any three
combinations, would suffice to determine the constants, and it
would be immaterial what set of three was selected, since all would
lead to the same results. But generally the observed values of F
will be affected by errors of observation and hence will not be
strictly consistent; and taking the above example each of the
$\frac{50\times49\times48}{6}=19{,}600$ different sets of three individual equations
would in general produce different values of the constants: so that,
apart from the prohibitive amount of labour required in the solution
of so many equations, we should have no means of deciding which
was the best or most advantageous solution, or how to combine the
solutions in order to obtain the best average results. The method
of least squares supplies the means of combining the original observations
in such a manner as to produce a number of equations, equal
to the number of unknowns (in the above example, three), the
solution of which by the usual process leads to the most probable
values of the unknown constants.

Suppose that the observed function F is a linear function of the
variables x, y, z, of the form lx + my + nz ..., and that the errors
in the observed values of F follow the “normal law”, so that the
probability of an error k is proportional to $e^{-k^{2}/c^{2}}$, where the standard
deviation of F is $\frac{c}{\sqrt2}$. We shall further suppose that the equations
have been so weighted that the value of c is the same in each of the
observations, or that the “precision” is uniform. Thus, for
example, if in a given equation the probability of an error of k in the
observed value of F is proportionate to $e^{-k^{2}/c_{1}^{2}}$, with a standard
deviation of $\frac{c_{1}}{\sqrt2}$, then multiplying the equation by $\frac{c}{c_{1}}$, we shall
have an equation with a standard deviation of $\frac{c}{\sqrt2}$ as before, and
the probability of an error k will be proportional to $e^{-k^{2}/c^{2}}$ as
required.

Let there be t equations as follows (where t is supposed greater
than the number of unknowns, say s):

$$\begin{aligned}
l_{1}x+m_{1}y+n_{1}z+\ldots-w_{1}F&=k_{1}\\
l_{2}x+m_{2}y+n_{2}z+\ldots-w_{2}F&=k_{2}\\
\ldots\qquad\ldots\qquad&\\
l_{t}x+m_{t}y+n_{t}z+\ldots-w_{t}F&=k_{t}
\end{aligned}\qquad(A)$$

where F represents the true value of the observed function and
k₁, k₂, ... the errors of observation. The chance of the errors being,
by hypothesis, respectively proportional to $e^{-k_{1}^{2}/c^{2}}$, $e^{-k_{2}^{2}/c^{2}}$, ... the
chance of the conjunction of these individual errors will be proportional
to $e^{-[k_{1}^{2}/c^{2}+k_{2}^{2}/c^{2}+\ldots]}$ which will obviously have its greatest
value when the quantity in brackets is a minimum. Now, the most
probable values of the constants will be those that give the greatest
probability of the observed event, i.e., the happening of the given
combination of errors. Thus, the most probable values will be those
making $[k_{1}^{2}/c^{2}+k_{2}^{2}/c^{2}+\ldots]$ or $\Sigma k^{2}/c^{2}$ or $\Sigma k^{2}$ a minimum; hence
the name “method of least squares.”
Now we have

$$\Sigma k^{2}=\Sigma[(l_{t}x+m_{t}y+n_{t}z+\ldots-w_{t}F)^{2}]$$

and since x, y, z ... are supposed to be independent, the minimum
value must correspond to such values of x, y, z ... as will make the
partial differential co-efficients of this expression, with respect to
x, y, z ..., all vanish.* Hence we must have, omitting a
common factor 2,

$$\begin{aligned}
\Sigma[l_{t}(l_{t}x+m_{t}y+n_{t}z+\ldots-w_{t}F)]&=0\\
\Sigma[m_{t}(l_{t}x+m_{t}y+n_{t}z+\ldots-w_{t}F)]&=0\\
\Sigma[n_{t}(l_{t}x+m_{t}y+n_{t}z+\ldots-w_{t}F)]&=0\\
\&c.,\ \&c.,\ \&c.&
\end{aligned}\qquad(B)$$


* These conditions, though necessary, are not in general sufficient to ensure a
minimum, but in this case it is obvious that a minimum exists because high
negative values and high positive values of x, y, z ... alike give large values to
the function.

(the summation extending to all values of t) as the system of
equations, s in number, for determining the most probable values of
x, y, z ... Hence the rule:

“First prepare the equations by multiplying each by its proper
weight (the reciprocal of the probable error or standard deviation),
thus giving a set of equations with a uniform p.e. and s.d. Multiply
each equation by the coefficient of x and add all the results together;
next multiply each by the coefficient of y and add all the results
together, and so on: the resulting aggregate equations, solved in
the usual manner, give the most probable values of the constants.”
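
The rule may be sketched in a few lines of code. What follows is an illustration under stated assumptions, not the author's own procedure: the function names are invented for the purpose, each observation is supplied with its own standard deviation, and the aggregate equations are solved by ordinary elimination.

```python
def normal_equations(rows, observed, sds):
    """Form the aggregate equations of the rule.

    rows     -- coefficients (l, m, n, ...) of each equation of condition
    observed -- observed value F belonging to each equation
    sds      -- standard deviation of each observed value
    """
    s = len(rows[0])
    lhs = [[0.0] * s for _ in range(s)]
    rhs = [0.0] * s
    for coeffs, f, sd in zip(rows, observed, sds):
        w = 1.0 / sd                      # weight: reciprocal of the s.d.
        wc = [w * c for c in coeffs]      # weighted coefficients
        wf = w * f                        # weighted observed value
        for i in range(s):
            # multiply the weighted equation by the coefficient of the
            # i-th unknown and add to the running aggregate
            rhs[i] += wc[i] * wf
            for j in range(s):
                lhs[i][j] += wc[i] * wc[j]
    return lhs, rhs


def solve(a, b):
    """Solve a small linear system by elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def least_squares(rows, observed, sds):
    """Most probable values of the unknowns under the normal law of error."""
    return solve(*normal_equations(rows, observed, sds))
```

With a single unknown and every coefficient equal to 1, `least_squares` returns the average of the observed values weighted by the reciprocals of the squared standard deviations.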

It will be seen at once that if there is only one constant to be 
determined, the method based on the normal law of error gives 
the weighted average, i.e. the total of the weighted values divided 
by the total weights, as the most probable. Conversely, it may be 
shown that if the weighted average is the most probable value, then 
the facility of error must follow the normal law. Apart, however, 
from any hypothesis as to the law of error, it may be shown 
mathematically that the method of least squares gives results which 
become more and more nearly accurate as the number of observations 
increases. Considerations of a more general kind will also lead to 
the conclusion that the method must produce very good results. 
Without giving any definite form to the law of error, it is obvious
that large errors are less probable than small, and that the most
advantageous system of values for the unknown constants will be
that which produces, on the whole, the smallest numerical deviations
(irrespective of sign) between the adjusted and observed values of
the function. Now, if the law of error is supposed unknown, we
cannot investigate mathematically the conditions required to produce
a minimum deviation irrespective of sign; and the simplest function
of the errors which is independent of sign is the square of the errors,
which will be the same for a positive or negative deviation,
and at the same time attributes a rapidly increasing importance, or
disadvantage, to the errors as they increase in magnitude. Hence
we can see, in a very general way, that a method which gives
a minimum value to the sum of the squares of the errors, is likely
to lead to satisfactory results consistent with elementary notions as
to the nature of the errors. Moreover, in actuarial work we usually
have to do with numbers sufficiently large to make the normal law 
of error very near the truth. 

Reverting to the system of equations (B), it will easily be seen
that if F is a parabolic function of the form x + ay + a²z + ... the
equations for determining x, y, z, ... &c., are equivalent to
reproducing ΣF, ΣaF, Σa²F (ΣwF, ΣwF.a, &c., if the equations are
weighted), i.e., the successive moments of the observations.
It has, so far, been supposed that the function F is a linear
function of the constants x, y, z, ... If this is not the case,
suppose that the equally weighted equations, from which the values
of x, y, z, ... are to be found, are of the form

$$\begin{aligned}
f_{1}(x,y,z,\ldots)-w_{1}F&=k_{1}\\
f_{2}(x,y,z,\ldots)-w_{2}F&=k_{2}\\
\&c.,\ \&c.,\ \&c.&
\end{aligned}\qquad(C)$$

where f₁, f₂ ... are known functions of the variables x, y, z.

By means of s of these equations, or of s combinations from
amongst them, or otherwise, find approximate values of x, y, z ...
say x', y', z' ...; and suppose that x = x' + δx, y = y' + δy, z = z' + δz,
&c., where it may be supposed that δx, δy, δz ..., representing
small corrections to be found, are so small that their squares
may be neglected. Then if

$$f'_{1}=f_{1}(x',y',z',\ldots),\quad f'_{2}=f_{2}(x',y',z',\ldots),\ \&c.$$

and so on, equation (C) will become

$$\begin{aligned}
f'_{1}+\delta x\Bigl(\frac{df'_{1}}{dx'}\Bigr)+\delta y\Bigl(\frac{df'_{1}}{dy'}\Bigr)+\delta z\Bigl(\frac{df'_{1}}{dz'}\Bigr)+\ldots-w_{1}F&=k_{1}\\
f'_{2}+\delta x\Bigl(\frac{df'_{2}}{dx'}\Bigr)+\delta y\Bigl(\frac{df'_{2}}{dy'}\Bigr)+\delta z\Bigl(\frac{df'_{2}}{dz'}\Bigr)+\ldots-w_{2}F&=k_{2}\\
\&c.\quad\&c.\quad\&c.&
\end{aligned}\qquad(D)$$

These equations are linear functions of the small corrections
δx, δy, δz ... which can accordingly be found by the rules just
explained, and hence are found the corrected values x = x' + δx,
y = y' + δy, &c. The process can be repeated, if greater accuracy is
desired, until the corrective terms become insignificant.

In the important particular case of a graduation by Makeham’s 
formula, the original equations are of the form 


(w being the “weight”). Approximate values of the constants, say
A', B', c', being found, the resulting equations for determining
δA, δB, δc are as follow, $\mu'_{x+\frac12}$ representing $A'+B'c'^{\,x+\frac12}$:

$$(\Sigma w)\,\delta A+(\Sigma w.c'^{\,x+\frac12})\,\delta B+(\Sigma w.B'(x+\tfrac12)c'^{\,x-\frac12})\,\delta c=\Sigma w\,(\mu_{x+\frac12}-\mu'_{x+\frac12})$$

$$(\Sigma w.c'^{\,x+\frac12})\,\delta A+(\Sigma w.c'^{\,2x+1})\,\delta B+(\Sigma w.B'(x+\tfrac12)c'^{\,2x})\,\delta c=\Sigma w.c'^{\,x+\frac12}(\mu_{x+\frac12}-\mu'_{x+\frac12})$$

$$(\Sigma w.B'(x+\tfrac12)c'^{\,x-\frac12})\,\delta A+(\Sigma w.B'(x+\tfrac12)c'^{\,2x})\,\delta B+(\Sigma w.B'^{2}(x+\tfrac12)^{2}c'^{\,2x-1})\,\delta c=\Sigma w.B'(x+\tfrac12)c'^{\,x-\frac12}(\mu_{x+\frac12}-\mu'_{x+\frac12})$$

For an example, see J.I.A., xvii, 161-71.


NOTE D. 


ON THE USE OF THE BINOMIAL CURVE TO REPRESENT
A CONTINUOUS SERIES.


If the Binomial curve $y=\frac{n!}{x!\,(n-x)!}\,p^{n-x}q^{x}$ made high contact with
the axis of x at the points x = -1 and x = n+1, where y becomes
zero, it could be conveniently employed to represent a continuous
curve in lieu of representing merely isolated ordinates; as in
that case the moments of the continuous curve would very
closely agree with those of the isolated ordinates. The same
would be true of any series of equidistant points on the
curve supposing these to be fairly numerous. If, for example,
we suppose the values of y tabulated for every integral value
of x/h, then the tth moment of the curve would be increased
by multiplication by the factor $h^{t}$, and from the observed
numerical values of the first 4 moments h and the remaining
constants could be obtained. As, however, the curve y cuts the
axis of x at an angle at both limits, this method of proceeding
will lead to approximate results only when n is fairly large.

The area of y treated as a continuous curve may be approximately
determined from the well known approximate formula

$$\int_{-1}^{n+1}y\,dx=\tfrac12y_{-1}+y_{0}+y_{1}+\ldots+y_{n}+\tfrac12y_{n+1}-\frac{1}{12}\Bigl[\frac{dy}{dx}\Bigr]_{-1}^{n+1}$$

$y_{-1}$ and $y_{n+1}$ being of course equal to zero and the series $y_{0}+y_{1}+\ldots+y_{n}$
is the expansion of $(p+q)^{n}$ where we assume p + q = 1, and is
therefore also = 1. As the factor $\frac{1}{x!}$ vanishes for x = -1, and $\frac{1}{(n-x)!}$
vanishes for x = n+1, we have

$$\Bigl(\frac{dy}{dx}\Bigr)_{-1}=\frac{n!}{(n+1)!}\,p^{n+1}q^{-1}=\frac{1}{n+1}\cdot\frac{p^{n+1}}{q}$$

$$\Bigl(\frac{dy}{dx}\Bigr)_{n+1}=-\frac{n!}{(n+1)!}\,p^{-1}q^{n+1}=-\frac{1}{n+1}\cdot\frac{q^{n+1}}{p}$$

since $\frac{d}{dx}\,\frac{1}{x!}=1$ when x = -1. Hence the area of the
curve y becomes

$$\int_{-1}^{n+1}y\,dx=1+\frac{1}{12}\cdot\frac{1}{n+1}\Bigl(\frac{p^{n+1}}{q}+\frac{q^{n+1}}{p}\Bigr).$$

Analogous expressions can be found for the approximate value of
the moments

$$\frac{\int xy\,dx}{\int y\,dx},\quad\frac{\int x^{2}y\,dx}{\int y\,dx},\ \&c.,$$

but the relations which result do not lead to sufficiently convenient
formulae for practical use.


NOTE E.


ON THE RELATIONS BETWEEN THE SUCCESSIVE MOMENTS AND
THE SUCCESSIVE SUMMATIONS OF A SERIES.


The relations given on p. 60 may be systematically demonstrated,
and developed to any extent that may be required, by means of the
ordinary interpolation formulae combined with a table of the
power-differences usually known as the “Differences of Nothing”;
see Text-Book, Part II, Ch. xxii, Art. 11; Sunderland’s “Notes
on Finite Differences”, pp. 24-5.

We have, by the ordinary interpolation formula,

$$v_{x}=v_{0}+x\,\Delta v_{0}+\frac{x(x-1)}{2!}\,\Delta^{2}v_{0}+\&c.$$

and hence

$$u_{x}.v_{x}=u_{x}v_{0}+xu_{x}.\Delta v_{0}+\frac{x(x-1)}{2!}\,u_{x}.\Delta^{2}v_{0}+\ldots$$

so that

$$\Sigma(u_{x}v_{x})=(\Sigma u_{x})v_{0}+(\Sigma xu_{x}).\Delta v_{0}+\Bigl(\Sigma\frac{x(x-1)}{2!}u_{x}\Bigr)\Delta^{2}v_{0}+\ldots$$

$$=(\Sigma u_{0})v_{0}+(\Sigma^{2}u_{1}).\Delta v_{0}+(\Sigma^{3}u_{2}).\Delta^{2}v_{0}+\ldots$$

using the notation of p. 60.
Put $v_{x}=x^{m}$ and we have

$$\Sigma x^{m}u_{x}=(\Sigma u_{0})0^{m}+(\Sigma^{2}u_{1})\Delta0^{m}+(\Sigma^{3}u_{2})\Delta^{2}.0^{m}+\ldots$$

Putting m equal successively to 1, 2, 3 ..., taking the differences
from the table of the differences of nothing, and noting that the
first term vanishes whatever the value of m, we can write down
at once:

$$\begin{aligned}
\Sigma x.u_{x}&=\Sigma^{2}u_{1}\\
\Sigma x^{2}u_{x}&=\Sigma^{2}u_{1}+2\Sigma^{3}u_{2}\\
\Sigma x^{3}u_{x}&=\Sigma^{2}u_{1}+6\Sigma^{3}u_{2}+6\Sigma^{4}u_{3}\\
\Sigma x^{4}u_{x}&=\Sigma^{2}u_{1}+14\Sigma^{3}u_{2}+36\Sigma^{4}u_{3}+24\Sigma^{5}u_{4}\\
\Sigma x^{5}u_{x}&=\Sigma^{2}u_{1}+30\Sigma^{3}u_{2}+150\Sigma^{4}u_{3}+240\Sigma^{5}u_{4}+120\Sigma^{6}u_{5}
\end{aligned}$$

These equations, divided by Σu₀, give the expressions for the
moments set out on page 60.
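
These relations lend themselves to a direct numerical check. The sketch below is an illustration only, with invented function names. It assumes that $\Sigma^{k+1}u_{k}$ denotes the result of summing the series k + 1 times, each summation running from a given term to the end of the series, and starting the last summation at the term $u_{k}$; the notation of p. 60 is not reproduced here, so this reading is an assumption. The four ordinates used are those implied by the products of the numerical example given later in this Note.

```python
def tail_sums(values):
    """Running totals taken from the end of the series back to each term."""
    out, total = [], 0.0
    for v in reversed(values):
        total += v
        out.append(total)
    return out[::-1]


def repeated_sum(u, order, start):
    """Series summed `order` times in all, the last summation beginning at index `start`."""
    column = list(u)
    for _ in range(order - 1):
        column = tail_sums(column)
    return sum(column[start:])


# Leading differences of nothing: row m holds the differences of 0^m of orders 1 to m
DIFFERENCES_OF_NOTHING = {
    1: [1],
    2: [1, 2],
    3: [1, 6, 6],
    4: [1, 14, 36, 24],
    5: [1, 30, 150, 240, 120],
}


def moment_by_summation(u, m):
    """Sum of x^m u_x built from the successive summations of the series."""
    return sum(c * repeated_sum(u, k + 1, k)
               for k, c in enumerate(DIFFERENCES_OF_NOTHING[m], start=1))


def moment_direct(u, m):
    """Sum of x^m u_x by direct multiplication, x running 0, 1, 2, ..."""
    return sum(x ** m * ux for x, ux in enumerate(u))


if __name__ == "__main__":
    series = [16.74, 15.69, 14.70, 12.99]
    for m in range(1, 6):
        print(m, moment_direct(series, m), moment_by_summation(series, m))
```

Only additions are needed to form the successive summations, which is the source of the saving of labour when the series is long.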


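Taking next the usual central difference formula,

$$v_{x}=v_{0}+xa_{0}+\frac{x^{2}}{2!}\,b_{0}+\frac{(x+1)x(x-1)}{3!}\,c_{0}+\frac{(x+1)x^{2}(x-1)}{4!}\,d_{0}+\ldots$$

$$=v_{0}+xa_{0}+\frac12\Bigl\{\frac{x(x-1)}{2!}+\frac{(x+1)x}{2!}\Bigr\}b_{0}+\frac{(x+1)x(x-1)}{3!}\,c_{0}+\frac12\Bigl\{\frac{(x+2)(x+1)x(x-1)}{4!}+\frac{(x+1)x(x-1)(x-2)}{4!}\Bigr\}d_{0}+\ldots$$

Thus,

$$\Sigma(u_{x}v_{x})=(\Sigma u_{x})v_{0}+(\Sigma xu_{x})a_{0}+\frac12\Bigl\{\Sigma\frac{x(x-1)}{2!}u_{x}+\Sigma\frac{(x+1)x}{2!}u_{x}\Bigr\}b_{0}+\Bigl\{\Sigma\frac{(x+1)x(x-1)}{3!}u_{x}\Bigr\}c_{0}+\ldots$$

$$=(\Sigma u_{0})v_{0}+\Sigma^{2}u_{1}.a_{0}+\tfrac12(\Sigma^{3}u_{2}+\Sigma^{3}u_{1})\,b_{0}+\Sigma^{4}u_{2}.c_{0}+\tfrac12(\Sigma^{5}u_{3}+\Sigma^{5}u_{2})\,d_{0}+\ldots$$

the law of the terms being manifest; or, abbreviating the expression
$\tfrac12(\Sigma^{t}u_{x}+\Sigma^{t}u_{x+1})$
by the single symbol $\Sigma'^{\,t}u_{x+\frac12}$, the series may be written

$$\Sigma(u_{x}v_{x})=(\Sigma u_{0})v_{0}+(\Sigma^{2}u_{1})a_{0}+(\Sigma'^{\,3}u_{1\frac12})b_{0}+(\Sigma^{4}u_{2})c_{0}+\ldots$$

Putting $v_{x}=x^{m}$, forming the central differences of $x^{m}$ as shown
in the scheme below, we write down at once

$$\begin{aligned}
\Sigma x.u_{x}&=\Sigma^{2}u_{1}\\
\Sigma x^{2}u_{x}&=2\Sigma'^{\,3}u_{1\frac12}\\
\Sigma x^{3}u_{x}&=6\Sigma^{4}u_{2}+\Sigma^{2}u_{1}\\
\Sigma x^{4}u_{x}&=24\Sigma'^{\,5}u_{2\frac12}+2\Sigma'^{\,3}u_{1\frac12}
\end{aligned}$$

The simplification in the formulae is, of course, due to the fact
that when m is even the odd central differences vanish, and when
m is odd the even central differences vanish.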


It is sometimes required to find moments of the form 
1\” ¢ 
Ud x (w+ 1) . 


r 
For this purpose we may use the formula (see “Sunderland’s Notes 
on Finite Differences,” p. 32)— 


He 1) ar(a; — -1)(@- #) 2 


te= 3 (uy +0) + (a - $)Aty + a vy 


ZA + 0-1) + 


seeaNee enn 4g 10a) tan 


=F lo+n)+5 5 (eta Aes Mad D1 a2 bh.) 


S (2+ Yale — 1) +a(a — 1)m— 2) a3, 
2 Is 
4. @4 Hat =e = 21 AN. fou 


whence we find, in the same manner as before, that commencing 
with 1w,, we shall have 


+0 
De. We = Dw. 2 3 + St. At +3" Wy. 1 (+ Ao. 1) 


- 


+ BuayjA%o-1 + Buy > (A! -1 + Atv-o) + Peters 


g2r—1 (2z-1)' a Pay 


Putting 


the following Table shows the values of (22 - 1)” and its differences, 


whence tz and its differences will be found by diy pone Dyer: 


A a3 (22-1) AAs Betray 


A & iS IRS 
=e —-5 = 25 | —125 | 625 
2: -16 | 98 —544 
=e —3 9 8 — 27 = G2 are = SL 464 
2| -—$8$ -{ 26 48 | — 80 —384 
ae ie ae ae 8S - 1 —24 jj (1 80 38+ | 
0 2 (1) O (8) (0) 2 (0) 48 (1) 0 {80} 0 (384) 
5 ee ee: 1 eee +24 | 1 0) 354 
2; Bet 26 48. 80 384 
+3 +3 9 8 + 27 +72 tat Ot 464 
2; 16 98 - 644 
+5! +5 : 25 +125 625 


Dividing by the appropriate power of 2 and inserting the values 
of : (v% +11), Aro, : (A°r + A®x-1), &e., 


the last formula becomes 


Wirtlz =3(251 wz, commencing with (E)"e, 
e =(whenm=1) Ywy ; 
= (when m=2) 2>%u, +3201 
=(when m=3) 63%w3+ rk 3 ie a oO 
= (when m= 4)245%0ts + 5312 + a Dery 


Writing now w}=%, and so on, i¢., reckoning the ordinates from 
me 3 m 
zero, so that the moments are of the form (2) +ux(3) +o... 


these become 


N -M3 = 62'us + isu 


ee 
al 


N mtg = 242°} + BD mit a = 
This will be made clearer by a numerical example. Take the
following series.

    Distance from
    origin multi-       u_x      u_x x a²     u_x x a³     u_x x a⁴
    plied by 2 (a)

          1            16.74       16.74        16.74        16.74
          3            15.69      141.21       423.63      1270.89
          5            14.70      367.50      1837.50      9187.50
          7            12.99      636.51      4455.57     31188.99
                       -----     -------      -------     --------
       Total           60.12     1161.96      6733.44     41664.12
       Divisor                       ÷ 4          ÷ 8         ÷ 16
       Moment                     290.49       841.68      2604.01

    Successive summations of u_x (each summed from the foot of the column):

        u_x      First     Second      Third     Fourth      Fifth
       16.74     60.12     144.18
                          (114.12)
       15.69     43.38      84.06     137.73     204.39
                                                (135.525)
       14.70     27.69      40.68      53.67      66.66      79.65
       12.99     12.99      12.99      12.99      12.99      12.99
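$$\Sigma'^{\,2}u=114.12$$

$$2\Sigma^{3}u+\tfrac14\Sigma u=2\times137.73+\tfrac14\times60.12=275.46+15.03=290.49$$

$$6\Sigma'^{\,4}u+\tfrac14\Sigma'^{\,2}u=6\times135.525+\tfrac14\times114.12=813.15+28.53=841.68$$

$$24\Sigma^{5}u+5\Sigma^{3}u+\tfrac1{16}\Sigma u=24\times79.65+5\times137.73+\tfrac1{16}\times60.12=1911.60+688.65+3.76=2604.01$$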


With a heavy series of terms, the saving of labour by the
summation method will, as may easily be seen, be very considerable.
A further saving of labour may be obtained by calculating the
moments round some convenient central point, and thus breaking
up the series into two sections in the manner indicated in
Mr. Elderton’s treatise, pp. 22-33; and any of the formulae described
in these notes may be applied in this manner.
NOTE F.


ON THE IDENTITY OF THE METHOD OF MOMENTS AND METHOD
OF LEAST SQUARES IN THE CASE OF AN EXPONENTIAL
FUNCTION.

Suppose y an exponential function of x so that

$$y=e^{a+bx+cx^{2}+\&c.},\ \text{say.}$$

Then if y be taken to represent any group in a frequency distribution
where the number of groups is large the probable error in y
will be approximately $\sqrt{y}$. Assume the true values of y, i.e., the
true values of a, b, c ..., to be approximately known, and let the
observed values of y be denoted by y'. If, then, we weight each
equation y' - y = 0 by the factor $\frac{1}{\sqrt{y}}$, writing

$$\frac{1}{\sqrt{y}}\,(y-y')=0\qquad\ldots(1)$$

we shall have a series of equations of condition in which the
probable error is in each case identical; that is to say, they will be
suitably weighted for the application of the method of least
squares (see Note C., p. 117-8).


Writing

$$y=\Bigl(y+\delta a.\frac{dy}{da}+\delta b.\frac{dy}{db}+\&c.\Bigr)=y\,(1+\delta a+x.\delta b+x^{2}.\delta c,\ \&c.)$$

equation (1) becomes

$$\frac{1}{\sqrt{y}}\,[\,y\,(1+\delta a+x.\delta b+x^{2}.\delta c+\&c.)-y'\,]=0\qquad\ldots(2)$$

and multiplying each equation successively by the coefficients of
δa, δb, &c., i.e., by $\frac{y}{\sqrt{y}}$, $\frac{xy}{\sqrt{y}}$, $\frac{x^{2}y}{\sqrt{y}}$, &c., and taking the sum of each
set of products, according to the rules of the method of least
squares, we get

$$\Sigma[\,y\,(1+\delta a+x.\delta b+x^{2}.\delta c,\ \&c.)-y'\,]=0$$

$$\Sigma[\,yx\,(1+\delta a+x.\delta b+x^{2}.\delta c,\ \&c.)-x.y'\,]=0$$

&c., &c.

as the system of equations for determining, according to the
method of least squares, the small corrections to be applied to the
approximate values a, b, c, ... used in obtaining the approximate
values of y.

Now, obviously, if y is so taken that

$$\Sigma(y-y')=0,\qquad\Sigma x(y-y')=0,\qquad\Sigma x^{2}(y-y')=0,\ \&c.$$

i.e., if the values of the constants a, b, c are found by the method
of moments, &c., the above equations are satisfied by δa = δb = δc = 0;
that is to say, the corrections are zero, or the values found for
a, b, c ... by the method of moments are in conformity with the
method of least squares on the assumption that the observations are
properly weighted by multiplying by the factors $\frac{1}{\sqrt{y}}$, the weights
being assumed invariable. It may, however, be supposed that small
variations in the constants, a, b, c, ... would produce slight
variations in the weights, in which case other solutions may exist
which would also lead, by the method of least squares, to equations
satisfied by δa = δb = δc = 0; but as it is well known that small
differences in weights have practically no effect on the results, it is
evident that any such alternative solution must be very close to
that already found.


NOTE G. 


ON OBTAINING THE VALUE OF MAKEHAM’S CONSTANT c
DIRECT FROM THE EXPOSURES AND DEATHS.


As stated in the text an exact value of this constant is not very
important, and this may be illustrated by reference to the data for
ascending premium assurances given in Table X. An approximate
value for c may readily be found by a process such as the following,
which is in principle analogous to the aggregate method employed
by Mr. King in the Text-Book, Part II. Take the values of μ for
the central age of each group in Table X. Reject the initial and
final values, as depending upon only two and three deaths respectively.
Take the six values for central ages 32½ to 57½, weighted respectively
by the factors 1, 3, 5, 5, 3, 1; weight the six values for central
ages 47½ to 72½, and also for 62½ to 87½ in the same manner. We
shall then have the following totals:


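$$\begin{array}{lll}
\mu_{32\frac12}\times1=.0119 & \mu_{47\frac12}\times1=.0137 & \mu_{62\frac12}\times1=.0340\\
\mu_{37\frac12}\times3=.0345 & \mu_{52\frac12}\times3=.0534 & \mu_{67\frac12}\times3=.1647\\
\mu_{42\frac12}\times5=.0655 & \mu_{57\frac12}\times5=.1160 & \mu_{72\frac12}\times5=.3660\\
\mu_{47\frac12}\times5=.0685 & \mu_{62\frac12}\times5=.1700 & \mu_{77\frac12}\times5=.5720\\
\mu_{52\frac12}\times3=.0534 & \mu_{67\frac12}\times3=.1647 & \mu_{82\frac12}\times3=.6540\\
\mu_{57\frac12}\times1=.0232 & \mu_{72\frac12}\times1=.0732 & \mu_{87\frac12}\times1=.3379\\[4pt]
S_{1}=.2570 & S_{2}=.5910 & S_{3}=2.1286
\end{array}$$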


If the mortality follows Makeham’s law, we shall have

$$\frac{S_{3}-S_{2}}{S_{2}-S_{1}}=\frac{1.5376}{.3340}=c^{15}$$

since 15 years is the interval between the centres of our empirical
groups. This gives log c = .0442 nearly. If we take the sum of
the unweighted values of μ in three groups for ages 32½ to 47½, 52½
to 67½, and 72½ to 87½, we should obtain in similar manner

$$\frac{S_{3}-S_{2}}{S_{2}-S_{1}}=\frac{.6136}{.0797}=c^{20},$$

giving log c = .0443.

We may conclude, therefore, that log c probably lies between
.044 and .045. The values of μ for ages 27½ and 92½, which we
have omitted in the foregoing, are respectively much below and
much above the general curve. If these values had been
included duly weighted, we should have obtained a slightly larger
value of log c, nearer to .045.

If we adopt .045 as an approximate value, we obtain for the
values of the constants A and B, by the process described on p. 65,


A = .00950 B = .00003712


We will call this curve (α), the deviations from the adjusted values
of θ in Table X being shown in the Table below. We might


Ascending Premium Assurance Experience. 


Deviations in Computed Deaths for Curves (α) and (β).


DEVIATIONS 
Middle <a. ee Ok AS es te 
Age Observed Deaths Computed Deaths — Observed Deaths 
of corrected as 
Group per Table (X) a : 
Curve (a) | Curve (8) 
| ' * log e="045 log c="046 
| 
| . Rage po : 
273 8 ‘9 ie het ety ae 
323 29° e 35 | a 28 
373 1020 | sf 18 52 2. 
42h 175-2 2 ie bere 5-9 
47% 191°7 128 er Ta - 
52h 218°6 3:3 i ae ul 
574 228° 73 er eee S 
|: 62g 255°4 tt re ote 52 
674 274°4 mo, 24-8 ° | . 26°5 
72h 2056 12:0 2 12-0 a 
77h _ 1515 12-1 w | 186 is 
824 84:8 66 51 
an 22:3 5 ~ 
92} 21 Ll Ll 
| Sum of deviations 48-4 48°3 | | 46°9 468 
| ‘ 
Second sum , .| 389 37°5 | 39'3 | 39:9 
- * | 
a —'_—_____.- 
Plas Third sum. . | 13-7 718 i: a9 | 369 
I 


expect from our first rough approximation to log c that a smaller
value than .045, say .044, would give better results. We find,
however, that the third sum of the errors of the (α) curve is
negative, and this indicates an increase in the value of log c.

Since a higher value of c hollows out the curve at the middle
ages, increasing the computed deaths at the extremes of the table,
it is clear that the effect must be to increase the third sum of the
graduated deaths.

The probability is, therefore, that curve (α) will not be much
improved by changing the value of c.

If we take the alternative value log c = .046 we find the
deviations from the adjusted values of θ in Table X are given for
curve (β) on the previous page.

There is little to choose between the two graduations, notwithstanding
the smallness of the third sum of the deviations in curve
(β), for against this may be put the fact that the three largest
errors in (α) are all increased in (β). On the whole the curves may
be taken as showing that an approximate value of log c is generally
sufficient, and that nothing is gained by computing this constant to
several places of decimals.
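
The rough estimates of log c obtained earlier in this Note from the three totals may be reproduced as follows. This is a sketch only, with invented names: the values of μ are recovered by dividing the tabulated products by their weighting factors, and the value for central age 82½ is assumed to be .2180 (product .6540), that being the figure which reproduces the third total 2.1286.

```python
from math import log10

# Values of mu for central ages 32.5, 37.5, ..., 87.5, recovered from the
# weighted products in the text; the entry for age 82.5 is an assumption.
MU = [.0119, .0115, .0131, .0137, .0178, .0232,
      .0340, .0549, .0732, .1144, .2180, .3379]
WEIGHTS = [1, 3, 5, 5, 3, 1]


def weighted_total(first):
    """Weighted sum of six consecutive values of mu beginning at index `first`."""
    return sum(w * MU[first + i] for i, w in enumerate(WEIGHTS))


def log_c_weighted():
    """log10 c from the three weighted totals, whose centres lie 15 years apart."""
    s1, s2, s3 = weighted_total(0), weighted_total(3), weighted_total(6)
    return log10((s3 - s2) / (s2 - s1)) / 15


def log_c_unweighted():
    """log10 c from three unweighted groups of four values, 20 years apart."""
    s1, s2, s3 = sum(MU[0:4]), sum(MU[4:8]), sum(MU[8:12])
    return log10((s3 - s2) / (s2 - s1)) / 20


if __name__ == "__main__":
    print(weighted_total(0), weighted_total(3), weighted_total(6))
    print(log_c_weighted(), log_c_unweighted())
```

Under Makeham's law the constant A disappears from the differences of the totals and the constant B from their ratio, which is why the ratio depends on c alone.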

It may at first sight appear inconsistent with the general theory
to adopt values of the three constants which do not make the third
sum vanish; i.e., the third moment of the graduated and ungraduated
figures identical. It must, however, be remembered that the method
of least squares (and with it the method of moments) assumes that
the form of the curve is known a priori, in which case the method
gives the means of determining the most probable values of the
constants involved. When, however, we are dealing with a
mortality experience, we have no a priori right to assume that
Makeham’s law is strictly applicable; and, if it is not, the
deviations instead of following the normal law as assumed in the
theory of least squares, will include systematic deviations due to
departure from the Makeham law. In these circumstances the
method of least squares is not strictly applicable, and we are
therefore justified in allowing other considerations to guide us in
selection of the constants.


We may here note that if the exposures are represented by a
frequency curve, the deaths being recomputed to correspond to the
graduated exposures, then the value of log c may, in general, be
calculated from the moments of the exposures and of the recomputed
deaths. This can readily be done if the exposures are represented
by a binomial curve (see Calderon, J.I.A., xxxv, 157), although
precautions must be taken so to group data that the number of
terms in the binomial is not great, not more, say, than five or six;
or by the normal frequency curve (see Elderton’s “Frequency
Curves”, pp. 98-100); or by the curve $y=kx^{n}e^{-\gamma x}$, where, if E₀, E₁,
&c., represent the successive moments for the exposures round the
origin, and θ₀, θ₁, &c., the similar moments of the recomputed
deaths, we shall have

$$\frac{\gamma-\log_{e}c}{\gamma}=\frac{\dfrac{\theta_{1}}{E_{1}}-\dfrac{\theta_{0}}{E_{0}}}{\dfrac{\theta_{2}}{E_{2}}-\dfrac{\theta_{1}}{E_{1}}}$$

whence, γ being known, $\log_{e}c$ is easily found.

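The above relation may be thus demonstrated. The force of
mortality at age x is assumed to be of the form $A+Bc^{x}=A+Be^{x\log_{e}c}=A+Be^{\lambda x}$,
putting $\lambda=\log_{e}c$. Thus the death curve will be of the
form $A.kx^{n}e^{-\gamma x}+B.kx^{n}e^{-(\gamma-\lambda)x}$ where the second term is of the same
form as the first with γ - λ substituted for γ. But by the well-known
properties of the Gamma integral (see Williamson’s
“Integral Calculus”, Art. 120) we have

$$\int_{0}^{\infty}x^{n+1}e^{-\gamma x}\,dx=\frac{n+1}{\gamma}\int_{0}^{\infty}x^{n}e^{-\gamma x}\,dx$$

whence it is easily seen that, writing E'₀, E'₁, &c., for the moments
of $kx^{n}e^{-(\gamma-\lambda)x}$,

$$\begin{aligned}
E_{0}&=E_{0} & \theta_{0}&=AE_{0}+BE'_{0}\\
E_{1}&=E_{0}\times\frac{n+1}{\gamma} & \theta_{1}&=AE_{1}+BE'_{0}\,\frac{n+1}{\gamma-\lambda}\\
E_{2}&=E_{0}\times\frac{(n+1)(n+2)}{\gamma^{2}} & \theta_{2}&=AE_{2}+BE'_{0}\,\frac{(n+1)(n+2)}{(\gamma-\lambda)^{2}}
\end{aligned}$$

whence

$$\begin{aligned}
\theta_{0}\div E_{0}&=A+B\,\frac{E'_{0}}{E_{0}}\\
\theta_{1}\div E_{1}&=A+B\,\frac{E'_{0}}{E_{0}}\cdot\frac{\gamma}{\gamma-\lambda}\\
\theta_{2}\div E_{2}&=A+B\,\frac{E'_{0}}{E_{0}}\cdot\frac{\gamma^{2}}{(\gamma-\lambda)^{2}}
\end{aligned}$$

so that

$$\frac{\gamma-\lambda}{\gamma}=\Bigl(\frac{\theta_{1}}{E_{1}}-\frac{\theta_{0}}{E_{0}}\Bigr)\div\Bigl(\frac{\theta_{2}}{E_{2}}-\frac{\theta_{1}}{E_{1}}\Bigr).$$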
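
The relation may be tested on an artificial case in which everything is known beforehand. The sketch below is illustrative and not drawn from the original; the function names and the numerical constants are assumptions chosen for the test, and γ must exceed $\log_{e}c$ for the moments of the second term to exist. The moments are computed from the Gamma integral, the deaths are built up as $A E_{t} + B E'_{t}$, and $\log_{e}c$ is then recovered from the moments alone.

```python
from math import gamma, log


def gamma_curve_moments(k, n, rate, count=3):
    """Moments about the origin of k * x**n * exp(-rate * x) over 0 to infinity."""
    return [k * gamma(n + t + 1) / rate ** (n + t + 1) for t in range(count)]


def exposure_and_death_moments(A, B, k, n, g, lam):
    """Moments E_t of the exposures and theta_t = A*E_t + B*E'_t of the deaths."""
    E = gamma_curve_moments(k, n, g)
    E_dash = gamma_curve_moments(k, n, g - lam)  # requires g > lam
    theta = [A * e + B * ed for e, ed in zip(E, E_dash)]
    return E, theta


def log_e_c_from_moments(E, theta, g):
    """Recover lambda = log_e c from (g - lambda)/g, the ratio of the differences."""
    ratio = ((theta[1] / E[1] - theta[0] / E[0])
             / (theta[2] / E[2] - theta[1] / E[1]))
    return g * (1.0 - ratio)


if __name__ == "__main__":
    lam = 0.045 * log(10)  # log_e c corresponding to log10 c = .045
    E, theta = exposure_and_death_moments(A=0.0095, B=0.000037,
                                          k=1.0, n=3, g=0.15, lam=lam)
    print(lam, log_e_c_from_moments(E, theta, 0.15))
```

In practice the moments would come from the graduated exposures and the recomputed deaths, and the recovered value would be affected by the errors of both.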
If the exposures, as often happens, can only be represented by
a curve of the form $y=kx^{\alpha}(1-x)^{\beta}$ (where x represents a proportionate
part of the range of the curve so that x ranges between 0
and 1) and if, as before, we represent the successive moments for
exposures and deaths by m₀, m₁, &c., M₀, M₁, &c., where m₀ and M₀ are
made = 1, then writing

(a+1)—- (a+ B+2)M,=R, 
(a+ 1)M, - (a+ B+ 2)M,=R, 


it will be found that, putting y for the range of the curve in years 
of age, 


ee ie ee 
(Mg — Mz) — h(img — ia) 


from which as the numerical value of all the quantities except 
_ log, cand h are known, these two may be easily found. 

This may be shown as follows :— 

Let the curve of exposed to risk he represented by the type 


y =hat(1 —.x)P 
where the entire range of the curve is taken as unity, and assume 


I, a, and B to be determined in the usual manner. 
Let the curve of the recomputed deaths be of the form 


Alo*(1 — x)? + Blat(1—w)Pev® = Ay+ Bs . . . (1) 
ie., we assume that — £ (log Lz) = A + Be” 


2) 


As regards the curve z, we shall have 


logs =a logz + Blog (1 - x) +x 


dz -(¢ ele +7): 


ae Ne L-2 
or, multiplying both sides by w*1(1 — 2x), 


Git Aye =[aat — (a+ Bott +4 y(at? ate 2. (2) 
ee Che: 
Integrating the left-hand side of this equation by parts, and
noting that the factor $(x^{t+1} - x^{t+2})$ is zero for the limits 1 and 0,

$$\int_0^1 \left[(t+2)x^{t+1} - (t+1)x^{t}\right]z\,dx = \int_0^1 \left[\alpha x^{t} - (\alpha+\beta)x^{t+1} + \gamma(x^{t+1} - x^{t+2})\right]z\,dx$$

that is

$$(t+2)m'_{t+1} - (t+1)m'_{t} = \alpha m'_{t} - (\alpha+\beta)m'_{t+1} + \gamma(m'_{t+1} - m'_{t+2})$$

and

$$(\alpha+t+1)m'_{t} - (\alpha+\beta+t+2)m'_{t+1} + \gamma(m'_{t+1} - m'_{t+2}) = 0 \qquad (3)$$

where $m'_t$ represents the $t$th moment of the curve $z$ round the
ordinate $x = 0$.


If $\gamma = 0$, the curve $z$ becomes identical with $y$, and writing $m_t$
for the $t$th moment of $y$ round the ordinate $x = 0$, we have

$$(\alpha+t+1)m_{t} - (\alpha+\beta+t+2)m_{t+1} = 0 \qquad (4)$$
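
Relation (4) can be checked numerically. The following is only a minimal sketch, assuming the moments of $kx^{\alpha}(1-x)^{\beta}$ are evaluated through the Beta integral with Python's `math.gamma`; the function names `beta_curve_moment` and `recurrence_residual` are hypothetical and do not come from the Lectures.

```python
import math

def beta_curve_moment(t, alpha, beta, k=1.0):
    """t-th moment about x = 0 of y = k * x**alpha * (1 - x)**beta on [0, 1].

    Uses the Beta integral: the integral of x**(a-1) * (1-x)**(b-1) over
    [0, 1] equals Gamma(a) * Gamma(b) / Gamma(a + b).
    """
    a = alpha + t + 1
    b = beta + 1
    return k * math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def recurrence_residual(t, alpha, beta, k=1.0):
    """Left-hand side of relation (4); it should vanish up to rounding."""
    m_t = beta_curve_moment(t, alpha, beta, k)
    m_next = beta_curve_moment(t + 1, alpha, beta, k)
    return (alpha + t + 1) * m_t - (alpha + beta + t + 2) * m_next

# Illustrative (assumed) constants, not taken from any experience.
residuals = [recurrence_residual(t, 2.5, 4.0) for t in range(4)]
```

Each entry of `residuals` should differ from zero only by floating-point rounding, whatever positive values of the constants are chosen.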


Write, as before, the total of the exposed $= E_0$, and of the
deaths $= \theta_0$, respectively, and represent the total of the exposed
multiplied at each age by the factor $e^{\gamma x}$ by $E'_0$.

Let $E_t$ and $\theta_t$ be the $t$th moments of the curve of exposed to
risk and of the recomputed deaths, the areas of the curves not
being taken as = 1, but having the values $E_0$ and $\theta_0$ above defined,
that is to say, representing the total exposures and the total deaths.

And let $E'_t$ be the $t$th moment of the curve of exposures multiplied
at each age by $e^{\gamma x}$.


Then we have

$$\theta_t = AE_t + BE'_t \qquad (5)$$

where $\theta_t$ and $E_t$ are known, but the remaining quantities unknown.

From (3) and (4)

$$(\alpha+t+1)E_{t} - (\alpha+\beta+t+2)E_{t+1} = 0 \qquad (6)$$

and

$$(\alpha+t+1)E'_{t} - (\alpha+\beta+t+2)E'_{t+1} + \gamma(E'_{t+1} - E'_{t+2}) = 0 \qquad (7)$$

Write

$$(\alpha+t+1)\theta_{t} - (\alpha+\beta+t+2)\theta_{t+1} = R_t,$$

from (5)

$$(\alpha+t+1)(AE_{t} + BE'_{t}) - (\alpha+\beta+t+2)(AE_{t+1} + BE'_{t+1}) = R_t$$

and from (6)

$$(\alpha+t+1)BE'_{t} - (\alpha+\beta+t+2)BE'_{t+1} = R_t$$

and from (7)

$$B\gamma(E'_{t+2} - E'_{t+1}) = R_t \qquad (8)$$
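Since from (5)

$$BE'_t = \theta_t - AE_t$$

we have

$$\gamma\left[(\theta_{t+2} - \theta_{t+1}) - A(E_{t+2} - E_{t+1})\right] = R_t \qquad (9)$$

writing $t = 0$ and $t = 1$ respectively, we get

$$\gamma\left[(\theta_2 - \theta_1) - A(E_2 - E_1)\right] = R_0$$

$$\gamma\left[(\theta_3 - \theta_2) - A(E_3 - E_2)\right] = R_1$$

whence

$$R_1\left[(\theta_2 - \theta_1) - A(E_2 - E_1)\right] = R_0\left[(\theta_3 - \theta_2) - A(E_3 - E_2)\right]$$

$$A = \frac{R_1(\theta_2 - \theta_1) - R_0(\theta_3 - \theta_2)}{R_1(E_2 - E_1) - R_0(E_3 - E_2)}$$

also, from (9)

$$\gamma = \frac{R_0}{(\theta_2 - \theta_1) - A(E_2 - E_1)}$$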


The value of $B$ cannot be determined directly from these equations
as it enters symmetrically with the values of $E'_t$. It is therefore
necessary, having found the value of $\gamma$, to compute the value of $E'_0$
and thence deduce $B$ from equation (5).
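
The steps above can be traced numerically. What follows is only a sketch under stated assumptions: the exposed-to-risk and death curves are synthetic (the values of `ALPHA`, `BETA`, `K`, `TRUE_A`, `TRUE_B` and `TRUE_GAMMA` are invented for illustration), the moments are obtained by Simpson's rule rather than by summation over ages, and the function names are hypothetical.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def moments(f, count):
    """Moments of orders 0 .. count-1 of f about x = 0 over the range [0, 1]."""
    return [simpson(lambda x, t=t: x**t * f(x), 0.0, 1.0) for t in range(count)]

def fit_constants(exposed, deaths, alpha, beta):
    """Recover A, gamma and B from the moments E_t and theta_t."""
    E = moments(exposed, 4)
    theta = moments(deaths, 4)
    # R_t = (alpha+t+1) theta_t - (alpha+beta+t+2) theta_{t+1}, for t = 0, 1
    R = [(alpha + t + 1) * theta[t] - (alpha + beta + t + 2) * theta[t + 1]
         for t in (0, 1)]
    # A from the two instances (t = 0, 1) of equation (9), eliminating gamma
    num = R[1] * (theta[2] - theta[1]) - R[0] * (theta[3] - theta[2])
    den = R[1] * (E[2] - E[1]) - R[0] * (E[3] - E[2])
    A = num / den
    # gamma from equation (9) with t = 0
    gamma = R[0] / ((theta[2] - theta[1]) - A * (E[2] - E[1]))
    # E'_0 needs gamma, after which B follows from equation (5) with t = 0
    E0_prime = simpson(lambda x: exposed(x) * math.exp(gamma * x), 0.0, 1.0)
    B = (theta[0] - A * E[0]) / E0_prime
    return A, gamma, B

# Synthetic illustration only: all constants below are assumptions.
ALPHA, BETA, K = 2.0, 3.0, 1000.0
TRUE_A, TRUE_B, TRUE_GAMMA = 0.005, 0.0005, 4.0

def exposed(x):
    return K * x**ALPHA * (1.0 - x)**BETA

def deaths(x):
    return exposed(x) * (TRUE_A + TRUE_B * math.exp(TRUE_GAMMA * x))

A, gamma, B = fit_constants(exposed, deaths, ALPHA, BETA)
```

Because the synthetic deaths follow the assumed law exactly, the recovered `A`, `gamma` and `B` should agree with the constants used to build them, apart from quadrature and rounding error. With real data the agreement depends on how closely the mortality follows the assumed form.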

Unless the mortality follows Makeham’s law very closely better
results will be obtained by calculating both $E'_0$ and $E'_1$ and obtaining
values of $A$ and $B$ satisfying the equations

$$AE_0 + BE'_0 = \theta_0$$
$$AE_1 + BE'_1 = \theta_1$$
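
The pair of equations just given is an ordinary two-by-two linear system, and may be solved by determinants. A minimal sketch follows, assuming the four moments have already been computed; the function name `solve_A_B` and the figures in the example call are invented for illustration.

```python
def solve_A_B(E0, E0_prime, E1, E1_prime, theta0, theta1):
    """Solve  A*E0 + B*E0' = theta0  and  A*E1 + B*E1' = theta1  for (A, B)."""
    det = E0 * E1_prime - E0_prime * E1
    if det == 0:
        raise ValueError("the two equations are not independent")
    A = (theta0 * E1_prime - E0_prime * theta1) / det
    B = (E0 * theta1 - E1 * theta0) / det
    return A, B

# Invented figures: chosen so that A = 2 and B = 3 satisfy both equations.
A, B = solve_A_B(1.0, 2.0, 4.0, 5.0, 8.0, 23.0)
```

Since $E'_0$ and $E'_1$ both depend on $\gamma$, this step comes only after $\gamma$ has been found.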


Table of Values of $y = \dfrac{2}{\sqrt{\pi}}\displaystyle\int_0^x e^{-x^2}\,dx$

x | y | Δ || x | y | Δ || x | y
.01 | .01128 | 1128 || .51 | .52924 | 866 || 1.01 |
.02 | .02256 | 1128 || .52 | .53790 | 856 || 1.02 |
.03 | .03384 | 1127 || .53 | .54646 | 848 || 1.03 |
.04 | .04511 | 1126 || .54 | .55494 | 838 || 1.04 |
.05 | .05637 | 1125 || .55 | .56332 | 830 || 1.05 |
.06 | .06762 | 1124 || .56 | .57162 | 820 || 1.06 |
.07 | .07886 | 1122 || .57 | .57982 | 810 || 1.07 |
.08 | .09008 | 1120 || .58 | .58792 | 802 || 1.08 |
.09 | .10128 | 1118 || .59 | .59594 | 792 || 1.09 |
.10 | .11246 | 1116 || .60 | .60386 | 782 || 1.10 |
.11 | .12362 | 1114 || .61 | .61168 | 773 || 1.11 |
.12 | .13476 | 1111 || .62 | .61941 | 764 || 1.12 |
.13 | .14587 | 1108 || .63 | .62705 | 754 || 1.13 |
.14 | .15695 | 1105 || .64 | .63459 | 744 || 1.14 |
.15 | .16800 | 1101 || .65 | .64203 | 735 || 1.15 |
.16 | .17901 | 1098 || .66 | .64938 | 725 || 1.16 | .89910
.17 | .18999 | 1094 || .67 | .65663 | 715 || 1.17 | .90200
.18 | .20093 | 1091 || .68 | .66378 | 706 || 1.18 | .90484
.19 | .21184 | 1086 || .69 | .67084 | 696 || 1.19 | .90761
.20 | .22270 | 1082 || .70 | .67780 | 687 || 1.20 | .91031
.21 | .23352 | 1078 || .71 | .68467 | 676 || 1.21 | .91296
.22 | .24430 | 1072 || .72 | .69143 | 667 || 1.22 | .91553
.23 | .25502 | 1068 || .73 | .69810 | 658 || 1.23 | .91805
.24 | .26570 | 1063 || .74 | .70468 | 648 || 1.24 | .92051
.25 | .27633 | 1057 || .75 | .71116 | 638 || 1.25 | .92290
.26 | .28690 | 1052 || .76 | .71754 | 628 || 1.26 | .92524
.27 | .29742 | 1046 || .77 | .72382 | 619 || 1.27 | .92751
.28 | .30788 | 1040 || .78 | .73001 | 609 || 1.28 | .92973
.29 | .31828 | 1035 || .79 | .73610 | 600 || 1.29 | .93190
.30 | .32863 | 1028 || .80 | .74210 | 590 || 1.30 | .93401
.31 | .33891 | 1022 || .81 | .74800 | 581 || 1.31 | .93606
.32 | .34913 | 1015 || .82 | .75381 | 571 || 1.32 | .93807
.33 | .35928 | 1008 || .83 | .75952 | 562 || 1.33 | .94002
.34 | .36936 | 1002 || .84 | .76514 | 553 || 1.34 | .94191
.35 | .37938 | 995 || .85 | .77067 | 543 || 1.35 | .94376
.36 | .38933 | 988 || .86 | .77610 | 534 || 1.36 | .94556
.37 | .39921 | 980 || .87 | .78144 | 525 || 1.37 | .94731
.38 | .40901 | 973 || .88 | .78669 | 515 || 1.38 | .94902
.39 | .41874 | 965 || .89 | .79184 | 507 || 1.39 | .95067
.40 | .42839 | 958 || .90 | .79691 | 497 || 1.40 | .95229
.41 | .43797 | 950 || .91 | .80188 | 489 || 1.41 | .95385
.42 | .44747 | 942 || .92 | .80677 | 479 || 1.42 | .95538
.43 | .45689 | 934 || .93 | .81156 | 471 || 1.43 | .95686
.44 | .46623 | 925 || .94 | .81627 | 462 || 1.44 | .95830
.45 | .47548 | 918 || .95 | .82089 | 453 || 1.45 | .95970
.46 | .48466 | 909 || .96 | .82542 | 445 || 1.46 | .96105
.47 | .49375 | 900 || .97 | .82987 | 436 || 1.47 | .96237
.48 | .50275 | 892 || .98 | .83423 | 428 || 1.48 | .96365
.49 | .51167 | 883 || .99 | .83851 | 419 || 1.49 | .96490
.50 | .52050 | 874 || 1.00 | .84270 | 411 || 1.50 | .96611
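
The tabulated integral is the function now usually called the error function, so individual entries can be checked against Python's standard `math.erf`. The sketch below is an illustration only; the helper name `erf_table` and its arguments are hypothetical, and the differences are expressed in units of the last decimal place, as in the Δ columns.

```python
import math

def erf_table(start, stop, step=0.01, places=5):
    """Rows (x, y, diff) with y = erf(x) rounded to `places` decimals.

    diff is the difference to the next tabulated y in units of the last
    decimal place; it is None for the final row.
    """
    count = round((stop - start) / step) + 1
    xs = [round(start + i * step, 2) for i in range(count)]
    ys = [round(math.erf(x), places) for x in xs]
    rows = []
    for i, (x, y) in enumerate(zip(xs, ys)):
        diff = None
        if i + 1 < count:
            diff = round((ys[i + 1] - y) * 10**places)
        rows.append((x, y, diff))
    return rows

first_rows = erf_table(0.01, 0.03)
for x, y, diff in first_rows:
    print(f"{x:.2f} | {y:.5f} | {'' if diff is None else diff}")
```

Calling `erf_table(2.51, 2.90, places=7)` gives values comparable with the seven-figure part of the continuation of the table.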


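Table of Values of $y = \dfrac{2}{\sqrt{\pi}}\displaystyle\int_0^x e^{-x^2}\,dx$ (continued).

x | y || x | y || x | y | Δ
1.51 | .96728 || 2.01 | .995525 || 2.51 | .9996143 | 202
1.52 | .96841 || 2.02 | .995720 || 2.52 | .9996345 | 192
1.53 | .96952 || 2.03 | .995906 || 2.53 | .9996537 | 183
1.54 | .97059 || 2.04 | .996086 || 2.54 | .9996720 | 173
1.55 | .97162 || 2.05 | .996258 || 2.55 | .9996893 | 165
1.56 | .97263 || 2.06 | .996423 || 2.56 | .9997058 | 157
1.57 | .97360 || 2.07 | .996582 || 2.57 | .9997215 | 149
1.58 | .97455 || 2.08 | .996734 || 2.58 | .9997364 | 141
1.59 | .97546 || 2.09 | .996880 || 2.59 | .9997505 | 135
1.60 | .97635 || 2.10 | .997021 || 2.60 | .9997640 | 127
1.61 | .97721 || 2.11 | .997155 || 2.61 | .9997767 | 121
1.62 | .97804 || 2.12 | .997284 || 2.62 | .9997888 | 115
1.63 | .97884 || 2.13 | .997407 || 2.63 | .9998003 | 109
1.64 | .97962 || 2.14 | .997525 || 2.64 | .9998112 | 103
1.65 | .98038 || 2.15 | .997639 || 2.65 | .9998215 | 98
1.66 | .98110 || 2.16 | .997747 || 2.66 | .9998313 | 93
1.67 | .98181 || 2.17 | .997851 || 2.67 | .9998406 | 88
1.68 | .98249 || 2.18 | .997951 || 2.68 | .9998494 | 84
1.69 | .98315 || 2.19 | .998046 || 2.69 | .9998578 | 79
1.70 | .98379 || 2.20 | || 2.70 | .9998657 | 75
1.71 | .98441 || 2.21 | || 2.71 | .9998732 | 71
1.72 | .98500 || 2.22 | || 2.72 | .9998803 | 67
1.73 | || 2.23 | || 2.73 | .9998870 | 63
1.74 | || 2.24 | || 2.74 | .9998933 | 61
1.75 | || 2.25 | || 2.75 | .9998994 | 57
1.76 | .98719 || 2.26 | || 2.76 | .9999051 | 54
1.77 | .98769 || 2.27 | || 2.77 | .9999105 | 51
1.78 | .98817 || 2.28 | || 2.78 | .9999156 | 48
1.79 | .98864 || 2.29 | || 2.79 | .9999204 | 46
1.80 | .98909 || 2.30 | || 2.80 | .9999250 | 43
1.81 | .98952 || 2.31 | || 2.81 | .9999293 | 41
1.82 | .98994 || 2.32 | || 2.82 | .9999334 | 38
1.83 | .99035 || 2.33 | || 2.83 | .9999372 | 37
1.84 | .99074 || 2.34 | .999065 || 2.84 | .9999409 | 34
1.85 | .99111 || 2.35 | .999111 || 2.85 | .9999443 | 33
1.86 | .99147 || 2.36 | .999155 || 2.86 | .9999476 | 31
1.87 | .99182 || 2.37 | .999197 || 2.87 | .9999507 | 29
1.88 | .99216 || 2.38 | .999237 || 2.88 | .9999536 | 27
1.89 | .99248 || 2.39 | .999275 || 2.89 | .9999563 | 26
1.90 | .99279 || 2.40 | .999311 || 2.90 | .9999589 | 109
1.91 | .99309 || 2.41 | .999346 || 2.95 | .9999698 | 81
1.92 | .99338 || 2.42 | .999379 || 3.00 | .9999779 | 60
1.93 | .99366 || 2.43 | .999411 || 3.05 | .9999839 | 45
1.94 | .99392 || 2.44 | .999441 || 3.10 | .9999884 | 32
1.95 | .99418 || 2.45 | .999469 || 3.15 | .9999916 | 24
1.96 | .99443 || 2.46 | .999497 || 3.20 | .9999940 | 29
1.97 | .99466 || 2.47 | .999523 || 3.30 | .9999969 | 16
1.98 | .99489 || 2.48 | .999547 || 3.40 | .9999985 | 8
1.99 | .99511 || 2.49 | .999571 || 3.50 | .9999993 | 3
2.00 | || 2.50 | .999593 || 3.60 | .9999996 |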


TABLE OF 


[The constants are restricted to positive values


i % ' 
. CHARACTER OF CURVE Limits oF & | 


: [ 
1 a) fide i Mean Z ‘ | " 
| quati = ae i 2 : 3 
: a on | z Lower|Upper f | t 
| Shape Range Pp 
owen E>: si kc | Gas ? | 
Limited 2 re) | a | ra) 
I Syuanetsical both ka? —22)? =e} T#2 | m+l | 
| ways acl | 
78) LL Sides GE. SG. 1) Pky. ieee Speed | 
Fae | | 
E 7° | a 
; il Syeemenicel itea ke“ (Normal Curve) | | + 4 0 | 3a | 0 
| ete FOE ae ct ae a | 
i hoe 2 iD aes = wes” aaa 
, il Symmetrical Nite kia+a) St 6 |) - Wades - 
| (n> 3) . | 
eee Da re I SE Dee he 8 ee 
| | : | 
1 Limited | k(a—2)"P-1(a + 2)na-) a? | 16( p—9)p¢ 
| IV: Skew both 2 ‘a. +a (q—p)e a Tk 
| ways } (p+g=1) | | Pek et TS 2) 
oa | oe BS Ss ee Se Py ~---—-— | ee 
| | ; 
| \Limited 2. | : ee 5 
| Vv Ske lone way ze 1 a | 0 | +0 ma ma 2ma 
' ; \ 
pp eee AS ER ane eee al | Saf ER. 22 RE 
: Rene 'k(a— a)"P - Ma +a)- (nq+1) 
5 Limited | | 4pg , | 16(p+9)p¢ 
VI Skew [one way ett taj; +a (p+9)o| a-i” \G-Hiee 
1 n ‘ . | 
ee = are ae: nae a ee ee | ~ om 
; i | ' | 
Rn @ “ 
VIL} Skew |Limited) = za-intv—e = | 9 a ee 
; one wa | +0 aioe Va h\ae 
| ! " (n>3) | a | n=(n—1) | a3(n—1)(n 
oe ae a sk Ga br ers Be 
| Un- 0° a ($+1) vtan-1 =| va e ne + -¥* | 4y(n* + y*) 
| VItl! Skew | ing HC +2%) “2 teat esl! ee 6 oa 
/ (n>) t a(n —1) in! (n—l)(a— 


NOTES: $\beta_1 = \mu_3^2 \div \mu_2^3$. $\beta_2 = \mu_4 \div \mu_2^2$.


Skewness = (Mean − mode) ÷ σ


ois oe » ee A ee 
Criterion = K = 4(2—3y)(4—3+) 
FREQUENCY CURVES.


(i.e., all >0), and in Types III, VI, VII and VIII, n must be >1.]


an be oar = Se Ve St | i. roe — 
| 
BL B=py +p? / Bo Fis in 
1 3 = fe Y +3 te 
aes ee at, i 
ee +i {3 +2 
(@+H@+3) : | (a wan - 
1 — 
Se ——---- | =a Oe de 
| 2 
—@=3u, | (6) | 3 i 3 0 
4 Beasts =ay fs 2 ee eae Ir ee 
4 1 2-3) 
3a : Lore ‘3 2-2 
_—___—- i ! 3( —— 
(n—1)(n—3) o FS 3 0 
: aie Vie 
i 2 n+3 | 1 
A8pq(2+n—6p9) | Aa +1(p—9)| 3[2+(n—-6)pg]mt+1) | 3 n+2 -(7- 
set li(a+2)\a+ 3° (n+: 2)*pq . (n+ 2)(n + 3)pq y 2 ( 
e 3 negati 
Sacco bor eye eee -~——-- Pe o> 
4 m+2 i: 

i p2.e-38| 4 
48pq(2+n+6p9) gs A(n—1)(p+q?; 3[2+(n+6)pq](n—1) 3 a—2 ns 
(@—-1)(n—2)(n—3) (n—2)*pq | (a—2)(m=3)pq BaF Gene 

3 
_—— Re | 2 n--8 
16(x—1) | | 3 »—2 
Not required on | Z 
(a 2) | = cod 
; ek 3 
. j : 2-8) 
srl la+ NaF) — Su") ,NGw—1), oF | aml a ++ —B0F}) B93 SE 
m'(n—1)(n—2)(2—3) “Woe? (~—2)* n+ (n—2)(u—8)(n* +) “ 2 soe. 
' 3 


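Standard Deviation $= \sqrt{\mu_2} = \sigma$.

$$\text{Skewness} = \frac{\sqrt{\beta_1}\,(\beta_2 + 3)}{2(5\beta_2 - 6\beta_1 - 9)}$$

$$\text{Criterion} = \kappa = \frac{\beta_1(\beta_2 + 3)^2}{4(2\beta_2 - 3\beta_1 - 6)(4\beta_2 - 3\beta_1)}$$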
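
The constants in these notes follow directly from the second, third and fourth moments about the mean. The sketch below is an illustration only: the function name `pearson_constants` is hypothetical, the moments in the example are invented, and giving the skewness the sign of $\mu_3$ is an assumption made here rather than something stated in the table.

```python
import math

def pearson_constants(mu2, mu3, mu4):
    """Return beta1, beta2, sigma, skewness and the criterion kappa.

    mu2, mu3, mu4 are moments about the mean. The formulas break down when
    a denominator vanishes (for instance the normal curve, beta1 = 0 and
    beta2 = 3), so such cases raise ZeroDivisionError.
    """
    beta1 = mu3**2 / mu2**3
    beta2 = mu4 / mu2**2
    sigma = math.sqrt(mu2)
    # Assumption: the root of beta1 is taken with the sign of mu3.
    root_beta1 = math.copysign(math.sqrt(beta1), mu3)
    skewness = root_beta1 * (beta2 + 3) / (2 * (5 * beta2 - 6 * beta1 - 9))
    kappa = (beta1 * (beta2 + 3) ** 2
             / (4 * (2 * beta2 - 3 * beta1 - 6) * (4 * beta2 - 3 * beta1)))
    return {"beta1": beta1, "beta2": beta2, "sigma": sigma,
            "skewness": skewness, "kappa": kappa}

# Invented moments giving beta1 = 1 and beta2 = 4.
constants = pearson_constants(1.0, 1.0, 4.0)
```

With these invented moments the skewness is 7/10 and the criterion is −49/52; which type of curve a given value of the criterion indicates is a matter for the table itself.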

